UCFindTextBreak
Uses locale-specific text-break information to find boundaries in Unicode text.
Declaration
OSStatus UCFindTextBreak(TextBreakLocatorRef breakRef, UCTextBreakType breakType, UCTextBreakOptions options, const UniChar *textPtr, UniCharCount textLength, UniCharArrayOffset startOffset, UniCharArrayOffset *breakOffset);
Parameters
- breakRef:
A valid reference to a text-break locator object. If the type of boundary specified by the breakType parameter is BreakChar, you can pass NULL. Use the UCCreateTextBreakLocator function to obtain a text-break locator object reference. If non-NULL, the text-break locator object must support the type of boundary specified in the breakType parameter.
- breakType:
A value of type UCTextBreakType, with exactly one bit set to specify a single type of boundary to be located. Because support for finding character boundaries is locale-independent and built into the UCFindTextBreak function, if you specify BreakChar as the type of boundary, the breakRef parameter is ignored and may be NULL.
- options:
A UCTextBreakOptions value that controls the operation of the UCFindTextBreak function. You can use text-break locator options to control some location-independent aspects of a text-boundary search. Note that if you do not specify any UCTextBreakOptions values, UCFindTextBreak searches forward but assumes that the startOffset value refers to the character preceding the offset rather than the one at the offset. This can result in UCFindTextBreak returning an offset that is equal to the start offset.
- textPtr:
A pointer to the initial character of the Unicode string to search.
- textLength:
The total count of Unicode characters in the string to search.
- startOffset:
A UniCharArrayOffset value specifying the offset from which UCFindTextBreak is to begin searching for the next text boundary of the type specified in the breakType parameter. If startOffset == 0, then kUCTextBreakLeadingEdgeMask must be set in the options parameter; if startOffset == textLength, then kUCTextBreakLeadingEdgeMask must not be set.
- breakOffset:
A pointer to a UniCharArrayOffset value. On return, the value pointed to by the breakOffset parameter is set to the offset of the text boundary located by UCFindTextBreak. In normal usage (when exactly one of kUCTextBreakLeadingEdgeMask and kUCTextBreakGoBackwardsMask is set), the result returned in breakOffset is not equal to the value supplied in the startOffset parameter unless an error occurs (and the function result is other than noErr). However, when kUCTextBreakLeadingEdgeMask and kUCTextBreakGoBackwardsMask are both set or both clear, the result produced in breakOffset can be equal to the value of startOffset.
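The rules stated for startOffset and breakOffset can be checked before a call is made. The following is a minimal, portable sketch of those two rules only; it does not call UCFindTextBreak. The SKETCH_ constants and both function names are hypothetical stand-ins introduced for illustration, and real code would use kUCTextBreakLeadingEdgeMask and kUCTextBreakGoBackwardsMask from the system headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the option masks named in the text above.
   Real code would use kUCTextBreakLeadingEdgeMask and
   kUCTextBreakGoBackwardsMask from the system headers instead. */
enum {
    SKETCH_LEADING_EDGE = 1u << 0,
    SKETCH_GO_BACKWARDS = 1u << 1
};

/* Returns true when (startOffset, options) satisfies the documented rules:
   at offset 0 the leading-edge option must be set, and at
   offset == textLength it must not be set. */
bool sketch_start_is_valid(size_t startOffset, size_t textLength,
                           uint32_t options)
{
    bool leading = (options & SKETCH_LEADING_EDGE) != 0;

    if (startOffset > textLength)
        return false;               /* offset lies outside the buffer */
    if (startOffset == 0 && !leading)
        return false;               /* no character precedes offset 0 */
    if (startOffset == textLength && leading)
        return false;               /* no character sits at the end offset */
    return true;
}

/* Returns true when the documented behavior allows breakOffset to equal
   startOffset without an error: both options set, or both clear. */
bool sketch_result_may_equal_start(uint32_t options)
{
    bool leading   = (options & SKETCH_LEADING_EDGE) != 0;
    bool backwards = (options & SKETCH_GO_BACKWARDS) != 0;
    return leading == backwards;
}

/* Usage: a few representative cases for a five-unit buffer. */
void sketch_self_check(void)
{
    assert(sketch_start_is_valid(0, 5, SKETCH_LEADING_EDGE));
    assert(!sketch_start_is_valid(0, 5, 0));
    assert(sketch_start_is_valid(5, 5, SKETCH_GO_BACKWARDS));
    assert(!sketch_start_is_valid(5, 5, SKETCH_LEADING_EDGE));
    assert(sketch_result_may_equal_start(0));
    assert(!sketch_result_may_equal_start(SKETCH_LEADING_EDGE));
}
```

Read literally, the two startOffset rules cannot both be satisfied for an empty buffer (startOffset == 0 == textLength), so this sketch rejects that case regardless of options.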
Return Value
A result code. The text-break locator referenced by the breakRef parameter must support the type of boundary specified in the breakType parameter; otherwise, the function returns kUCTextBreakLocatorMissingType.
Discussion
The UCFindTextBreak function starts from a specified offset in a text buffer and proceeds forward or backward (as requested) until it finds the next text boundary of a particular locale-specific type, using a given set of options. The types of breaks or boundaries in a line of Unicode text can include:
- Boundaries of characters (treating surrogate pairs as a single character).
- Boundaries of character clusters. A cluster is a group of characters that should be treated as a single text element for editing operations such as cursor movement. Typically this includes groups such as a base character followed by a sequence of combining characters, for example, a Hangul syllable represented as a sequence of conjoining jamo characters or an Indic consonant cluster.
- Boundaries of words. This can be used to determine what to highlight as the result of a double-click.
- Potential line break locations.
Finding boundaries of characters is a locale-independent operation, and support for it is built directly into the UCFindTextBreak function. If that is the only type of text boundary that you wish to locate, it is not necessary to call UCCreateTextBreakLocator and create a text-break locator object.
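To illustrate why character boundaries need no locale data, the following is a minimal, portable sketch of stepping over UTF-16 code units while treating a well-formed surrogate pair as a single character. It is not the implementation used by UCFindTextBreak, and it does not reproduce the function's option semantics. SketchUniChar and all function names are hypothetical; SketchUniChar stands in for a 16-bit UniChar.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a 16-bit UTF-16 code unit (UniChar). */
typedef uint16_t SketchUniChar;

static int sketch_is_high_surrogate(SketchUniChar c)
{
    return c >= 0xD800 && c <= 0xDBFF;
}

static int sketch_is_low_surrogate(SketchUniChar c)
{
    return c >= 0xDC00 && c <= 0xDFFF;
}

/* Returns the offset just past the character that begins at `start`.
   A high surrogate followed by a low surrogate counts as one character.
   Requires start < length. */
size_t sketch_next_char_boundary(const SketchUniChar *text, size_t length,
                                 size_t start)
{
    assert(text != NULL);
    assert(start < length);

    size_t next = start + 1;
    if (sketch_is_high_surrogate(text[start]) && next < length &&
        sketch_is_low_surrogate(text[next])) {
        next++;                     /* skip the second half of the pair */
    }
    return next;
}

/* Returns the offset at which the character ending at `start` begins.
   Requires 0 < start <= length. */
size_t sketch_prev_char_boundary(const SketchUniChar *text, size_t length,
                                 size_t start)
{
    assert(text != NULL);
    assert(start > 0 && start <= length);

    size_t prev = start - 1;
    if (sketch_is_low_surrogate(text[prev]) && prev > 0 &&
        sketch_is_high_surrogate(text[prev - 1])) {
        prev--;                     /* back up over the first half of the pair */
    }
    return prev;
}
```

For the four-unit buffer { 0x0041, 0xD83D, 0xDE00, 0x0042 } ("A", U+1F600 as a surrogate pair, "B"), stepping forward from offset 1 lands on offset 3 rather than 2, because the pair is consumed as one character.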
When you are finished with the text-break locator object, dispose of it using the function UCDisposeTextBreakLocator.