Hsiang: Yansanmo: depends on text alignment iirc
Nesin: Literphor: I have the data from this PDF as a JSON object
Groombridge: PDFs are garbage for extracting info from; it's an exercise in futility, but good luck.
Harpel: Gillice, Did you make the JSON object?
Weidenheimer: Gillice, what is the ‘x’ of “4016/3 .” ?
Victorica: Literphor: I used pdf2json for that
Brecht: Gillice, THANK YOU! That would be the answer to my question
Willougby: Literphor: I did mention it 😛
Peedin: Yansanmo: that’s 20.858 together with the rest of the description
Ruthstrom: And then what is the ‘x’ of “7255 CRAB” ?
Egidio: X: 20.858, y:18.582, text: 4016/3 CRAB CAPITAL DP SWITCH
Sinisi: Filter all the ‘x’ = 20.858
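A minimal sketch of that filter, assuming the pdf2json output has already been flattened into plain `{x, y, text}` objects like the one quoted above (pdf2json's raw output is nested differently). The second and third items and the `atColumn` helper are made up for illustration. Coordinates are floats, so the comparison uses a small tolerance instead of strict equality.

```javascript
// Hypothetical flattened items; only the first one comes from the chat.
const items = [
  { x: 20.858, y: 18.582, text: "4016/3 CRAB CAPITAL DP SWITCH" },
  { x: 3.1, y: 18.582, text: "1" },
  { x: 20.858, y: 21.4, text: "EXAMPLE SECOND ITEM" },
];

// Keep only the items whose x sits at the given column position.
function atColumn(items, x, tolerance = 0.01) {
  return items.filter((item) => Math.abs(item.x - x) <= tolerance);
}

const descriptions = atColumn(items, 20.858);
console.log(descriptions.map((item) => item.text));
```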
Iwanejko: Is anyone here familiar with working in iAd Producer?
Skwara: Yansanmo: it’s probably not what I’m going to do, but that’s basically how I’m approaching it
Keesey: If two strings are w/in some % of each other on the Y axis?
Tacderan: I’m likely indeed going to use the fact the 2nd line is within consistent distance of the 1st
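One way the "consistent distance" idea could look, as a sketch only. The gap value `0.7`, the second and third sample items, and the `findSecondLine` helper are all assumptions; it also assumes the second line starts at the same x as the first, which the chat does not confirm.

```javascript
// Hypothetical items; only the first one comes from the chat.
const items = [
  { x: 20.858, y: 18.582, text: "4016/3 CRAB CAPITAL DP SWITCH" },
  { x: 20.858, y: 19.282, text: "HYPOTHETICAL SECOND LINE" },
  { x: 20.858, y: 25.0, text: "UNRELATED" },
];

// Find the item that sits `gap` below `first` on the y axis (same x assumed).
function findSecondLine(first, items, gap, tolerance = 0.05) {
  return items.find(
    (item) =>
      item !== first &&
      Math.abs(item.x - first.x) <= tolerance &&
      Math.abs(item.y - first.y - gap) <= tolerance
  );
}

const second = findSecondLine(items[0], items, 0.7);
console.log(second ? second.text : "no second line found");
```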
Gleaves: Man, I love it when a plan comes together
Gleaves: But no one around here knows the troubles and elation of accomplishing programming problems…
Heller: I know exactly how to solve it, I’m just completely blocking
Daigle: That data structure looks HORRIBLE! Good luck dude. You might be better off parsing it yourself
Oldham: I don’t speak JS anymore D:
Evanski: Literphor: really? I found it surprisingly convenient
Brech: I didn’t even expect to get proper chunks of texts from a PDF
Northam: Gillice, books have a natural order from left to right, top to bottom. Having XY coordinates seems super complicated to iterate over; why not just present the data in the order it's gathered?
Holman: I just wouldn’t want to work with it. I sure as hell don’t want to think of a book in XY coordinates
Iwanejko: Anyone know how to make sifting through obfuscated JS easier?
Kapuscinski: Literphor: I will use the order as well
Bernskoetter: Literphor: first I’ll find the first item of the order Line# in the pic, and then I’ll grab everything from that same y-coordinate
Tiotuico: That’ll reliably be Line#, Quantity, Item Number, Unit, VAT Code, Price, Pricing Quantity and Line Total
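A rough sketch of grabbing everything on one y-coordinate and labelling it with those column names. It assumes flattened `{x, y, text}` items and exactly one text item per column, which may not hold for the real invoice. The sample values and the `rowAt` helper are invented.

```javascript
const COLUMNS = [
  "Line#", "Quantity", "Item Number", "Unit",
  "VAT Code", "Price", "Pricing Quantity", "Line Total",
];

// Hypothetical items for one order line, deliberately out of order,
// plus one item on a different line.
const items = [
  { x: 40.0, y: 18.582, text: "25.00" },
  { x: 2.0, y: 18.582, text: "1" },
  { x: 6.0, y: 18.582, text: "5" },
  { x: 10.0, y: 18.582, text: "4016/3" },
  { x: 25.0, y: 18.582, text: "PC" },
  { x: 28.0, y: 18.582, text: "A" },
  { x: 31.0, y: 18.582, text: "5.00" },
  { x: 35.0, y: 18.582, text: "1" },
  { x: 2.0, y: 30.0, text: "2" },
];

// Collect the items sharing a y-coordinate, sort left to right, name each cell.
function rowAt(items, y, tolerance = 0.01) {
  const cells = items
    .filter((item) => Math.abs(item.y - y) <= tolerance)
    .sort((a, b) => a.x - b.x);
  const row = {};
  COLUMNS.forEach((name, i) => {
    row[name] = cells[i] ? cells[i].text : undefined;
  });
  return row;
}

const row = rowAt(items, 18.582);
console.log(row);
```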
Newill: Gillice, oh I see, so it does maintain its order? The XY are just extra bits of information then?
Martincic: Literphor: oh yes, it’s not completely random
Howk: Literphor: but it indeed reads from left to right, top to bottom, dumbly
Kitelinger: Literphor: invoices often use columns, in which case data obviously ends up scrambled a bit
Jacinthe: It’ll read from column 1, then column 2, then column 1 next line, etc.
Neel: The one thing I don’t really like is that I don’t really have another way of figuring out what an order line is other than its unique x position
Lavanchy: If they ever change their invoice so another item is at that x, it’ll cause some trouble
Bjorkman: I guess I could make an exclusion list though
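The exclusion list could be as simple as a set of known non-order texts checked alongside the x filter. This is a sketch; the excluded string, the sample items, and the `orderLines` helper are hypothetical.

```javascript
// Hypothetical texts that share the order-line x position but are not order lines.
const EXCLUDED = new Set(["Subtotal"]);

// Hypothetical flattened items; only the first one comes from the chat.
const items = [
  { x: 20.858, y: 18.582, text: "4016/3 CRAB CAPITAL DP SWITCH" },
  { x: 20.858, y: 40.0, text: "Subtotal" },
];

// Items at the order-line x position, minus anything on the exclusion list.
function orderLines(items, x, excluded, tolerance = 0.01) {
  return items.filter(
    (item) =>
      Math.abs(item.x - x) <= tolerance && !excluded.has(item.text.trim())
  );
}

const lines = orderLines(items, 20.858, EXCLUDED);
console.log(lines.map((item) => item.text));
```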
Iwanejko: I know this seems like trolling, but I’m actually serious. I’m trying to figure out how an iBook widget is getting its checkboxes to work, and I need to sift through its unminified but obfuscated JS to figure it out.
Hocutt: Iwanejko: can’t you just make something similar?
Bonomi: What do you want to know about it? how it’s animated or its actual action?
Iwanejko: How they got the checkboxes to function properly.
Iwanejko: It’s a little hard to explain.
Iwanejko: iAd Producer and iBooks Author widgets allow you to use HTML/CSS/JS to build interactive components.