pdfHandler module
-
class pdfHandler.CharData(character, bb0, bb1, bb2, bb3)
Bases: object
Hold the properties of a character extracted from the PDF data.
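As a rough illustration of the kind of record this class describes, the sketch below defines a stand-in with the same constructor arguments. It is not the module's implementation: the name `CharDataSketch`, the attribute names, the `width` helper, and the reading of `bb0` to `bb3` as bounding-box coordinates (left, bottom, right, top) are all assumptions.

```python
class CharDataSketch:
    """Hypothetical stand-in mirroring the documented CharData signature."""

    def __init__(self, character, bb0, bb1, bb2, bb3):
        self.character = character
        # Assumption: bb0..bb3 describe the character's bounding box
        # as (left, bottom, right, top).
        self.bbox = (bb0, bb1, bb2, bb3)

    def width(self):
        # Horizontal extent, assuming bb0 is the left edge and bb2 the right edge.
        return self.bbox[2] - self.bbox[0]


ch = CharDataSketch("A", 10.0, 20.0, 16.5, 31.0)
print(ch.character, ch.bbox, ch.width())
```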
-
class pdfHandler.PageData
Bases: object
Hold the data of each page, including its sentences and characters.
-
addChar(c)
Add character data to the list.
- Parameters
c – The character to add.
- Returns
None
-
addSentence(s)
Add sentence data to the list.
- Parameters
s – The sentence to add.
- Returns
None
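A minimal sketch of a container with the two methods documented above may make the intended use clearer. The class name `PageDataSketch` and the attribute names `chars` and `sentences` are hypothetical; the real class may store its data differently.

```python
class PageDataSketch:
    """Hypothetical stand-in for the documented PageData container."""

    def __init__(self):
        # Assumption: characters and sentences are kept in two plain lists.
        self.chars = []
        self.sentences = []

    def addChar(self, c):
        # Append one character record; returns None, as documented.
        self.chars.append(c)

    def addSentence(self, s):
        # Append one sentence record; returns None, as documented.
        self.sentences.append(s)


page = PageDataSketch()
for c in "Hi.":
    page.addChar(c)
page.addSentence("Hi.")
print(len(page.chars), len(page.sentences))
```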
-
-
class pdfHandler.PdfHandler(pdfPath)
Bases: object
Handle the data of a whole PDF file.
-
generateHighlightedPdf()
Generate a highlighted PDF, using the color and annotation text set for each sentence.
- Returns
None
-
getSentence()
Get all sentences of the PDF data.
- Returns
All sentences.
-
makeSentence()
Make sentences from the extracted characters.
- Returns
None
-
textExtracWithCoord()
Extract each character from the PDF data, together with its coordinates.
- Returns
None
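The method descriptions suggest a natural call order: extract characters, build sentences, read them back, then write the highlighted file. That order is inferred from the descriptions, not stated by the documentation. The sketch below expresses it as a function that works on any object exposing the four documented methods. `run_pipeline` and `FakeHandler` are hypothetical names, and `FakeHandler` exists only so the snippet runs without the real module or a PDF file.

```python
def run_pipeline(handler):
    """Call the documented PdfHandler methods in an assumed order."""
    handler.textExtracWithCoord()      # extract characters and their coordinates
    handler.makeSentence()             # group the extracted characters into sentences
    sentences = handler.getSentence()  # read back the sentences
    handler.generateHighlightedPdf()   # write the highlighted output
    return sentences


class FakeHandler:
    """Hypothetical recorder used only so this sketch runs on its own."""

    def __init__(self):
        self.calls = []

    def textExtracWithCoord(self):
        self.calls.append("textExtracWithCoord")

    def makeSentence(self):
        self.calls.append("makeSentence")

    def getSentence(self):
        self.calls.append("getSentence")
        return ["First sentence.", "Second sentence."]

    def generateHighlightedPdf(self):
        self.calls.append("generateHighlightedPdf")


fake = FakeHandler()
result = run_pipeline(fake)
print(fake.calls)
print(result)
```

With the real module, the argument would presumably be `pdfHandler.PdfHandler(pdfPath)`, and colors or annotations would be set on the returned sentences before `generateHighlightedPdf()` is called.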
-
-
class pdfHandler.SentenceData(sentence, rectList, pageNum)
Bases: object
-
setAnnotation(annotText)
Set the annotation text for the sentence. The text is added as an annotation in the PDF data.
- Parameters
annotText – The annotation text.
- Returns
None
-
setColor(color)
Set the annotation color for the sentence.
- Parameters
color – The color.
- Returns
None
-
setRectList(offset)
Define the original coordinates for the PDF.
- Parameters
offset – The offsets from the original coordinates.
- Returns
None
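To show how the three setters might fit together, here is a sketch of a stand-in class with the documented constructor and method names. Everything about the internals is an assumption: the class name `SentenceDataSketch`, the attribute names, the RGB tuple used for the color, and in particular the reading of `setRectList` as shifting each `(x0, y0, x1, y1)` rectangle by a `(dx, dy)` offset.

```python
class SentenceDataSketch:
    """Hypothetical stand-in mirroring the documented SentenceData methods."""

    def __init__(self, sentence, rectList, pageNum):
        self.sentence = sentence
        self.rectList = rectList
        self.pageNum = pageNum
        self.color = None
        self.annotText = None

    def setAnnotation(self, annotText):
        # Store the text to attach to this sentence's highlight.
        self.annotText = annotText

    def setColor(self, color):
        # Store the highlight color (assumed here to be an RGB tuple).
        self.color = color

    def setRectList(self, offset):
        # Assumption: offset is (dx, dy) and every rectangle is (x0, y0, x1, y1);
        # each rectangle is translated by the offset.
        dx, dy = offset
        self.rectList = [
            (x0 + dx, y0 + dy, x1 + dx, y1 + dy)
            for (x0, y0, x1, y1) in self.rectList
        ]


s = SentenceDataSketch("Hello world.", [(10, 20, 60, 32)], 0)
s.setColor((1, 1, 0))
s.setAnnotation("check this claim")
s.setRectList((5, -2))
print(s.color, s.annotText, s.rectList)
```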
-