Bug duplicated linebbox

Clemens Hutter requested to merge bug_duplicated_linebbox into master

Solved

  • ignore the last page if the doctype is 2 or 3
  • if 02_xml contains multiple identical text lines, all but one are deleted; a CSV file listing all deleted duplicates is written to data/logs
  • calling .xml on a Document object now automatically recomputes the corrected XML if it is older than CONSTANTS.XML_versions
  • data/test_set_file_ids now holds files containing the IDs of the documents we are going to use for testing
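
The duplicate handling described above could look roughly like the following. This is only a minimal sketch, not the code in this branch: the function name `drop_duplicate_lines`, the `(bbox, text)` tuple representation of a text line, and the CSV column names are all assumptions made for illustration.

```python
import csv
from pathlib import Path


def drop_duplicate_lines(lines, log_path):
    """Keep the first occurrence of each text line and log the rest.

    `lines` is a list of (bbox, text) tuples, a hypothetical stand-in
    for the text-line elements stored in 02_xml.
    """
    seen = set()
    kept, deleted = [], []
    for bbox, text in lines:
        key = (bbox, text)
        if key in seen:
            # Same bounding box and same text as an earlier line: drop it.
            deleted.append(key)
        else:
            seen.add(key)
            kept.append(key)

    # Write the deleted duplicates to a CSV log (assumed column layout).
    log_path = Path(log_path)
    log_path.parent.mkdir(parents=True, exist_ok=True)
    with log_path.open("w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["bbox", "text"])
        writer.writerows(deleted)

    return kept, deleted
```

Keeping the first occurrence preserves the original reading order of the page, and the CSV log makes it possible to check afterwards which lines were removed.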

You can delete the branch afterwards
