4.1.2 Segment Break Transformation Rules |
seg-break-transformation-000 |
|
Script |
Whitespace and line break transformation
- All spaces and tabs immediately preceding or following a segment break are removed. If no F, H, W or ZWSP characters are involved, the segment break is converted to a space.
|
seg-break-transformation-001 |
|
Script |
Wide characters around line break
- If the East Asian Width property of both the character before and after the line feed is W and neither side is Hangul, then the segment break is removed.
|
seg-break-transformation-002 |
|
Script |
Fullwidth characters around line break
- If the East Asian Width property of both the character before and after the line feed is F and neither side is Hangul, then the segment break is removed.
|
seg-break-transformation-003 |
|
Script |
Halfwidth characters around line break
- If the East Asian Width property of both the character before and after the line feed is H and neither side is Hangul, then the segment break is removed.
|
seg-break-transformation-004 |
|
Script |
Won and halfwidth characters around line break
- If the East Asian Width property of both the character before and after the line feed is F or H and neither side is Hangul, then the segment break is removed.
|
seg-break-transformation-005 |
|
Script |
Wide character and non-wide character around line break
- If the East Asian Width property of only one of the characters before and after the line feed is F, W or H and neither side is Hangul, then the segment break is converted to a space.
|
seg-break-transformation-006 |
|
Script |
Fullwidth character and non-fullwidth character around line break
- If the East Asian Width property of only one of the characters before and after the line feed is F, W or H and neither side is Hangul, then the segment break is converted to a space.
|
seg-break-transformation-007 |
|
Script |
Halfwidth character and non-halfwidth character around line break
- If the East Asian Width property of only one of the characters before and after the line feed is F, W or H and neither side is Hangul, then the segment break is converted to a space.
|
seg-break-transformation-008 |
|
Script |
Wide and fullwidth characters around line break
- If the East Asian Width property of both the character before and after the line feed is F, W or H and neither side is Hangul, then the segment break is removed.
|
seg-break-transformation-009 |
|
Script |
Fullwidth and halfwidth characters around line break
- If the East Asian Width property of both the character before and after the line feed is F, W or H and neither side is Hangul, then the segment break is removed.
|
seg-break-transformation-010 |
|
Script |
Hangul characters around line break
- If the East Asian Width property of both the character before and after the line feed is F, W or H and neither side is Hangul, then the segment break is removed. Otherwise, the segment break is converted to a space.
|
seg-break-transformation-011 |
|
Script |
Hangul jamo characters around line break
- If the East Asian Width property of both the character before and after the line feed is F, W or H and neither side is Hangul, then the segment break is removed. Otherwise, the segment break is converted to a space.
|
seg-break-transformation-012 |
|
Script |
Hangul halfwidth jamo characters around line break
- If the East Asian Width property of both the character before and after the line feed is F, W or H and neither side is Hangul, then the segment break is removed. Otherwise, the segment break is converted to a space.
|
seg-break-transformation-014 |
|
Script |
Thai characters around line break
- If the East Asian Width property of both the character before and after the line feed is F, W or H and neither side is Hangul, then the segment break is removed. Otherwise, the segment break is converted to a space.
|
seg-break-transformation-015 |
|
Script |
Thai and Latin characters around line break
- If the East Asian Width property of both the character before and after the line feed is F, W or H and neither side is Hangul, then the segment break is removed. Otherwise, the segment break is converted to a space.
|
seg-break-transformation-016 |
|
Script |
Thai with ZWSP before line break
- If the character immediately before or immediately after the segment break is the zero-width space character (U+200B), then the break is removed, leaving behind the zero-width space.
|
seg-break-transformation-017 |
|
Script |
Thai with ZWSP after line break
- If the character immediately before or immediately after the segment break is the zero-width space character (U+200B), then the break is removed, leaving behind the zero-width space.
|
white-space-collapse-000 |
|
Script |
White space collapse
- Every tab is converted to a space. Any space immediately following another collapsible space is collapsed to have zero advance width.
|
white-space-collapse-001 |
|
Script |
White space and non-ASCII spaces
- Any space immediately following another collapsible space is collapsed to have zero advance width. This applies only to U+0020, not to other Unicode spaces.
|
white-space-collapse-002 |
|
Script |
Whitespace and bidi control characters
- All spaces and tabs immediately preceding or following a segment break are removed, ignoring bidi formatting characters as if they were not there.
|
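
The assertions in the rows above can be read together as one small decision procedure. The following Python sketch is one possible reading, not the reference implementation used by these tests. The function names (`transform_segment_breaks`, `collapse_spaces`, `is_fwh`, `is_hangul`) are hypothetical, and several simplifications are assumptions: a segment break is modeled as a single line feed, Hangul is detected from the Unicode character name, bidi formatting characters are not skipped, and "collapsed to zero advance width" is modeled as deleting the extra space.

```python
import re
import unicodedata

ZWSP = "\u200b"  # zero-width space


def is_fwh(ch: str) -> bool:
    """True if ch has East Asian Width F, W or H."""
    return bool(ch) and unicodedata.east_asian_width(ch) in ("F", "W", "H")


def is_hangul(ch: str) -> bool:
    """Assumption: treat a character as Hangul if its Unicode name says so."""
    return bool(ch) and "HANGUL" in unicodedata.name(ch, "")


def transform_segment_breaks(text: str) -> str:
    # Remove spaces and tabs immediately before or after each line feed.
    text = re.sub(r"[ \t]*\n[ \t]*", "\n", text)
    out = []
    for i, ch in enumerate(text):
        if ch != "\n":
            out.append(ch)
            continue
        before = text[i - 1] if i > 0 else ""
        after = text[i + 1] if i + 1 < len(text) else ""
        if before == ZWSP or after == ZWSP:
            continue  # break removed, the zero-width space stays
        if (is_fwh(before) and is_fwh(after)
                and not is_hangul(before) and not is_hangul(after)):
            continue  # both sides F, W or H and neither is Hangul: removed
        out.append(" ")  # every other case: converted to a space
    return "".join(out)


def collapse_spaces(text: str) -> str:
    # Every tab becomes a space, then runs of U+0020 shrink to one space.
    # Other Unicode spaces (for example U+3000) are left untouched.
    return re.sub(r" {2,}", " ", text.replace("\t", " "))
```

Under these assumptions, a line feed between two wide CJK ideographs disappears, a line feed between two Hangul syllables or between Thai and Latin text becomes a space, and a line feed next to U+200B is dropped while the zero-width space is kept.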