The text-related APIs enable searching, counting, exporting, and extracting text in PDF documents. ZSTextPage_Load must be called first to retrieve the text content of a specific PDF page before any text processing is performed. Table 3.14 lists some common APIs for text processing. For a complete list of APIs, please refer to ZSTextPage.h.


Table 3.14

No.  API Name                    Description
1    ZSTextPage_Load             Prepare the information of all characters in a page.
2    ZSTextPage_Release          Release all resources allocated for a PDF text page handle.
3    ZSTextPage_NumChars         Get a count of characters in a page. Generated characters, additional space and line breaks are also counted.
4    ZSTextSelection_GetChars    Extract text from a selected area on a page.
5    ZSTextPage_ExportToFile     Export text content in a page to a specific file handle.
6    ZSTextSearch_FindNext       Search for text from the beginning of the document to the end.
7    ZSTextSearch_GetSelection   Get a text selection handle from a text search when a match is found.
8    ZSTextLink_GetLink          Get the URL hyperlink.
9    ZSTextLink_NumLinks         Get a count of the text in URL format on a page.
10   ZSTextPage_SelectByRange    Get a text selection handle by specific character range.
11   ZSTextPage_StartSearch      Start a search for text in PDF document.

 


The following example shows how to search for text using these APIs.

Example: Search for a text pattern in a page

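/* Assumes page, searchPattern, and ret are declared and initialized earlier. */
ZSTextPage textPage = NULL;

/* Load the text information of the page before any text processing. */
ret = ZSTextPage_Load(page, &textPage);
if (ZS_OK == ret)
{
    ZSTextSearch textSearch = NULL;

    /* Start a search for searchPattern on this text page. */
    ret = ZSTextPage_StartSearch(textPage, &searchPattern, 0, 0, &textSearch);
    if (ZS_OK == ret)
    {
        ZSBool isMatch = ZS_TRUE;

        /* Look for the next match; isMatch receives the result. */
        ZSTextSearch_FindNext(textSearch, &isMatch);
    }

    /* Release the text page when done (see Table 3.14); the single-handle
       argument is assumed here, check ZSTextPage.h for the exact signature. */
    ZSTextPage_Release(textPage);
}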
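
The example above calls the find-next step once. To visit every match, the usual pattern is to repeat that step until it reports no match. The sketch below illustrates only the loop shape, using the C standard library on a plain string. It does not call the SDK: MockSearch, mock_start_search, mock_find_next, and count_matches are hypothetical stand-ins invented for illustration, and the real signatures should be taken from ZSTextPage.h.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a search handle; NOT part of the SDK. */
typedef struct {
    const char *text;     /* text being searched */
    const char *pattern;  /* pattern to look for */
    size_t pos;           /* offset where the next search starts */
} MockSearch;

/* Begin a search at character offset `start` (the "start search" step). */
static void mock_start_search(MockSearch *s, const char *text,
                              const char *pattern, size_t start)
{
    size_t len;
    assert(s != NULL && text != NULL && pattern != NULL);
    len = strlen(text);
    s->text = text;
    s->pattern = pattern;
    s->pos = start < len ? start : len;  /* clamp to the end of the text */
}

/* Advance to the next match (the "find next" step).
   Sets *is_match to 1 and *match_pos to the match offset,
   or sets *is_match to 0 when no further match exists. */
static void mock_find_next(MockSearch *s, int *is_match, size_t *match_pos)
{
    const char *hit;
    assert(s != NULL && is_match != NULL && match_pos != NULL);
    *is_match = 0;
    if (s->pattern[0] == '\0')
        return;  /* an empty pattern is treated as "no match" */
    hit = strstr(s->text + s->pos, s->pattern);
    if (hit != NULL) {
        *is_match = 1;
        *match_pos = (size_t)(hit - s->text);
        s->pos = *match_pos + 1;  /* resume one past the match start,
                                     so overlapping matches are found */
    }
}

/* Count all matches by repeating find-next until it reports no match. */
static size_t count_matches(const char *text, const char *pattern)
{
    MockSearch s;
    int is_match = 0;
    size_t match_pos = 0;
    size_t count = 0;

    mock_start_search(&s, text, pattern, 0);
    for (;;) {
        mock_find_next(&s, &is_match, &match_pos);
        if (!is_match)
            break;
        count++;
    }
    return count;
}
```

With the SDK, the same loop would presumably wrap ZSTextSearch_FindNext and test isMatch after each call, using ZSTextSearch_GetSelection inside the loop to obtain the matched text.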