Helfenstein: Gleaves: see that ‘data:’ at the start? the whole PDF is in the URL, encoded in base64
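A minimal sketch of recovering the PDF bytes from a URL like that, assuming it has the form `data:<mime>;base64,<payload>`. The function name `pdf_bytes_from_data_url` and the sample payload are made up for illustration.

```python
import base64


def pdf_bytes_from_data_url(url: str) -> bytes:
    """Decode the payload of a base64 data: URL (hypothetical helper)."""
    # Everything before the first comma is the header, the rest is the payload.
    header, _, payload = url.partition(",")
    if not header.startswith("data:") or not header.endswith(";base64"):
        raise ValueError("not a base64 data: URL")
    return base64.b64decode(payload)


# Made-up sample: a few bytes standing in for a real PDF.
sample = "data:application/pdf;base64," + base64.b64encode(b"%PDF-1.4 demo").decode("ascii")
pdf_bytes = pdf_bytes_from_data_url(sample)
```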
Alonza: Because when you order a car, it won’t have the same order description as when I order a pot of peanut butter
Cyrnek: So there is only one element which contains lots of text – so now you need to extract the text
Mcnair: And the quantity ‘20’ could also be part of a phone number
Blais: From this one element
Beech: Gleaves: is that a page you’re generating?
Shealey: So the only pattern is the position of the text of this single element
Soho: The only way for me to know what text has what meaning is by looking at the original PDF, and establishing a profile with exact coordinates
Guilianelli: The easiest thing that comes to mind would probably be macroing the mouse to drag-select at that position
Sovak: That requires a manual action
Bergamine: Gillice: the pdf lib API you are using doesn’t offer a method for this?
Fuglsang: Could there be a better suited pdf api?
Molony: This is already very useful
Bolender: I just need to find the most efficient way to search the stack
Lutterman: So you get the text of that element, and now you have to find out which part of the text would be at that position
Korbel: I have absolutely no issue finding these elements, it already works in v1 of this tool
Sobus: But now there are a few more variables to take into account
Breidenbach: Hashtag__: I have the text and the position
Schwipps: And I have a profile that tells me the position and its meaning
Deringer: So you have to find out what kind of combination it is?
Supino: So I need to compare this huge stack of random texts to the profile
Nanka: Like two text fields with this kind of text mean a bill of this kind
Shreck: You iterate over the elements and check whether they are the elements for the given position?
Deavers: I have a profile that tells me the phone number is at 12.04, 13.564 for example
Mangione: Which isn’t too hard for things like single phone numbers
Raul: And the PDF API has no function like “give me the element at position x,y”
Gruhlke: Because that’s 1 item at a fixed position for every single invoice
Giarratano: You have to make it yourself
Josue: Hm, a for loop over the items and checking the position?
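A minimal sketch of the loop suggested here, assuming each extracted item carries an x, a y and its text. `TextElement`, `element_at` and the tolerance value are hypothetical and not part of any particular PDF library; the coordinates reuse the 12.04, 13.564 example from the chat, and the phone number is invented.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class TextElement:
    # Hypothetical shape of one extracted text item.
    x: float
    y: float
    text: str


def element_at(elements: Iterable[TextElement], x: float, y: float,
               tol: float = 0.01) -> Optional[TextElement]:
    """Return the first element within `tol` of (x, y), or None."""
    for el in elements:
        # Compare with a tolerance: extracted coordinates are floats
        # and may not match the profile exactly.
        if abs(el.x - x) <= tol and abs(el.y - y) <= tol:
            return el
    return None


elements = [
    TextElement(5.0, 5.0, "Invoice"),
    TextElement(12.04, 13.564, "555-0100"),  # invented sample data
]
phone = element_at(elements, 12.04, 13.564)
```

This is a linear scan per lookup, which is fine for a single fixed field but adds up when many positions have to be checked.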
Emmerling: The fun part is orders
Carson: Does anyone think PhantomJS is good enough, or do you really want to test your site against real browsers?
Kules: Because there might be 1 order, there might be 20 orders
Domhoff: Carson: a real end to end test would involve real browsers the end users are using
Abramowski: And they might be on 1 row or 2
Battani: And they might be in columns or not
Luker: So what I do at the moment is find the x-coordinate of any order, which is the same for every order but otherwise unique in the invoice
Vulich: Then I’ll also know its y-coordinate, and I know the rest of the row will have the same y-coordinate
Armada: So I need to search the stack for that y-coordinate
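A rough sketch of that approach, assuming elements are simple (x, y, text) records and that every order starts at a known anchor x-coordinate. `Element`, `order_rows` and the sample values are hypothetical. It only handles orders that sit on a single row, so two-row orders would need extra logic.

```python
from collections import namedtuple

# Hypothetical record for one extracted text item.
Element = namedtuple("Element", "x y text")


def order_rows(elements, anchor_x, tol=0.01):
    """Collect one row per element found at the anchor x-coordinate."""
    rows = []
    for anchor in elements:
        if abs(anchor.x - anchor_x) > tol:
            continue  # not the start of an order
        # Scan the whole stack again for everything on the same y.
        row = [el for el in elements if abs(el.y - anchor.y) <= tol]
        rows.append(sorted(row, key=lambda el: el.x))
    return rows


elements = [
    Element(5.0, 5.0, "Invoice"),
    Element(10.0, 50.0, "Widget"),
    Element(40.0, 50.0, "20"),
    Element(10.0, 60.0, "Gadget"),
    Element(40.0, 60.0, "3"),
]
rows = order_rows(elements, anchor_x=10.0)
```

Each order triggers another pass over the full stack, which is the repeated scanning the rest of the discussion is trying to avoid.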
Vandenbergh: Carson: it’s not one or the other: use one for speed, and the other for accuracy
Packard: Hashtag__: yes, I could just go through the original stack a million times until I have all the data
Pacini: But that’s very expensive
Henslin: You can use a hash or something like that
Stablein: So I’m trying to figure out how to reduce the amount of lookups and the expense of each lookup
Kingma: For a cheap lookup of a particular field
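One way to read the hash suggestion, as a sketch: walk the stack once, bucket the elements by rounded y-coordinate in a dict, and then fetch a whole row with a single lookup. `build_row_index`, `row_at` and the rounding precision are assumptions for illustration. Rounding is a simplification, since two nearly equal coordinates that straddle a rounding boundary would land in different buckets.

```python
from collections import defaultdict


def build_row_index(elements, precision=2):
    """Map rounded y -> list of (x, text), built in one pass."""
    index = defaultdict(list)
    for x, y, text in elements:
        index[round(y, precision)].append((x, text))
    return index


def row_at(index, y, precision=2):
    """Return the row at y, sorted left to right, or [] if none."""
    # .get() avoids creating an empty bucket for missing rows.
    return sorted(index.get(round(y, precision), []))


# Invented sample data as (x, y, text) tuples.
elements = [
    (10.0, 50.0, "Widget"),
    (40.0, 50.0, "20"),
    (10.0, 60.0, "Gadget"),
]
index = build_row_index(elements)
row = row_at(index, 50.0)
```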
Brueckman: A small internal database may help
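A sketch of what a small internal database could look like, here using an in-memory SQLite database from Python's standard `sqlite3` module with an index on y. The table name `element`, the helper names and the sample rows are hypothetical.

```python
import sqlite3


def load_elements(elements):
    """Load (x, y, text) tuples into an in-memory SQLite table."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE element (x REAL, y REAL, text TEXT)")
    # Index on y so row lookups do not scan the whole table.
    db.execute("CREATE INDEX element_y ON element (y)")
    db.executemany("INSERT INTO element VALUES (?, ?, ?)", elements)
    return db


def row_near(db, y, tol=0.01):
    """Return the texts on the row within `tol` of y, left to right."""
    cur = db.execute(
        "SELECT text FROM element WHERE y BETWEEN ? AND ? ORDER BY x",
        (y - tol, y + tol),
    )
    return [text for (text,) in cur]


# Invented sample data.
db = load_elements([
    (40.0, 50.0, "20"),
    (10.0, 50.0, "Widget"),
    (10.0, 60.0, "Gadget"),
])
texts = row_near(db, 50.0)
```

Compared with a plain dict, the range query handles tolerance directly, at the cost of loading the data into the database first.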
Carson: I guess accuracy trumps speed in my case. It’s just such a hassle to set up Selenium