Pavolini: I’ll dig into it some more. I’ve got some good info. Thanks again.

Aiken: http://i.imgur.com/EP8c37z.png Hmm, so these orders have information on 2 lines. I have the x/y coordinates of all the individual text strings. I wonder what the most reliable way would be to find the 2nd line for each order.

Millikin: I don’t know mongoose, but if they aren’t giving you a clear way to sync their asyncs then you might want to look for a different solution. Frameworks should be somewhat simple to use, no?

Harrer: Heck, I need to figure out a better way to determine which strings are orders as well. I’m currently just relying on the fact that the first items of the orders have a unique x coordinate.
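A minimal sketch of the "unique x coordinate" idea, assuming each item is a dict with `text`, `x` and `y` keys as in the JSON described in this conversation. The function name `split_into_orders`, the tolerance `tol`, and the assumptions that y grows down the page and that strings on the same printed line share the same y are illustrative, not taken from the actual data.

```python
def split_into_orders(items, anchor_x, tol=0.05):
    """Group items into orders.

    A new order starts at every item whose x lies within `tol` of
    `anchor_x`; every other item is attached to the most recent order.
    """
    # Reading order: top to bottom, then left to right.
    # Assumes y increases down the page (hypothetical; flip the sort if not).
    ordered = sorted(items, key=lambda it: (it["y"], it["x"]))
    orders = []
    for it in ordered:
        if abs(it["x"] - anchor_x) <= tol:
            orders.append([it])       # first item of a new order
        elif orders:
            orders[-1].append(it)     # belongs to the current order
        # items before the first anchor (page headers etc.) are skipped
    return orders
```

Because everything between two anchors is attached to the earlier one, strings on a second line fall into the same order without extra work, as long as the second line does not itself start at the anchor x.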

Guiab: Ahh, gotcha. Okay, I’m probably just not looking in the right spot.

Zegarra: Are you doing OCR or something?

Mintken: Gillice, Are you scraping data off a site?

Lippman: Thankfully the text data is readily available in the PDFs and I don’t need to recognize each character

Zinn: I do need to give meaning to the hundreds of loose text strings though

Gantzler: I have a huge JSON with all the bits and their coordinates

Ebbs: So {'text': 20, 'x': 3.755, 'y': 54.83333}, and then several hundred or thousand of them

Kolp: I don’t get what the PDF “coordinates” are for

Kusch: They indicate where a bit of text is placed

Knower: Gillice, I have no idea what PDF data would look like. But if everything is textual, why not just stream the text and grab the text that you need?

Kupper: Gillice: and why do you care where it’s placed?

Norrington: Actual question: what are you trying to do, long version.

Floriano: Because that’s about the only way to figure out what a piece of text might be

Eckhardt: I see ‘Payment Due Date: 30-11-15’ and know what it means

Leapheart: You’re trying to find context for strings you’ve extracted from a PDF?

Fearon: But in this JSON I just have two strings, ‘Payment Due Date’ and ‘30-11-15’, not necessarily together
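One way to reunite a label with its value is to look for the nearest string to the right of the label on the same row. The sketch below assumes items are dicts with `text`, `x` and `y` keys; the helper name `value_for_label` and the row tolerance `y_tol` are made up for illustration, and the approach only works when the value really is printed to the right of its label.

```python
def value_for_label(items, label, y_tol=0.1):
    """Return the text of the closest item to the right of `label`
    on the same row, or None if the label or a value is not found."""
    # Match the label text, ignoring surrounding spaces and a trailing colon.
    anchors = [it for it in items
               if str(it["text"]).strip().rstrip(":") == label]
    if not anchors:
        return None
    anchor = anchors[0]
    # Same row: y within y_tol of the label. To the right: larger x.
    candidates = [it for it in items
                  if it is not anchor
                  and abs(it["y"] - anchor["y"]) <= y_tol
                  and it["x"] > anchor["x"]]
    if not candidates:
        return None
    nearest = min(candidates, key=lambda it: it["x"] - anchor["x"])
    return nearest["text"]
```

For example, with a ‘Payment Due Date’ item at some (x, y) and a ‘30-11-15’ item further right at almost the same y, `value_for_label(items, "Payment Due Date")` would return the date string.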

Kilcoyne: Gillice, are you trying to visually scrape the data or something? You should be able to just stream over the data and grab what you need

Lujan: Literphor: the grabbing what I need is the problem

Damphousse: Is there a way to programmatically get the pages a user has liked on facebook?

Ayele: He’s trying to give semantic meaning to his data

Hoang: Gillice: if this is legit, you need to ask for a data source that doesn’t start as a PDF.

Windon: Then I wouldn’t get paid because there wouldn’t be anything to do

Bellino: Are you using ps2ascii or something?

Heskett: Gillice, what if you did some funky math to check whether a string’s coordinates are in X area to the bottom left of the first string

Noordam: Steeze, The thing is if he’s streaming over the data then having X or Y coordinates doesn’t make much sense

Parrett: Or if all the strings are in similar format based on what they represent, create some fancy regex checks or something?

Gailes: So I’m confused on what he’s actually doing

Scarpelli: Literphor, good point

Hyun: I don’t get it, streaming over the data?

Hanekamp: He’s taking a PDF and extracting info from it, but PDF is a terrible data format to parse in any way

Aspri: Steeze: regex isn’t going to work here, there’s no way to predict what an order might be

Deneal: I did this successfully, it’s not too hard

Tecchio: But now 2 lines per order 😀 have to give it some thought

Morasca: I guess all the x must be the same

Staie: Gillice, well, the PDF is a large file of data. Unless you’re using a tool that parses the data for you, I’m assuming you’re streaming over the data and trying to parse it yourself

Neiss: Well, if you know the first line, wouldn’t it be fairly easy to find a string whose coordinates fall inside some area positioned in a known way relative to your first line?
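That suggestion could be sketched roughly as follows: given the first item of an order, collect every string that sits inside a rectangle just below it. The function name `second_line_items` and the window parameters `max_dy`, `x_min_offset` and `x_max_offset` are hypothetical, the sketch assumes y increases down the page, and the window sizes would need tuning against real documents.

```python
def second_line_items(items, first, max_dy=1.5,
                      x_min_offset=-0.5, x_max_offset=30.0):
    """Return the items inside a window just below `first`,
    sorted left to right.

    The window covers rows strictly below `first` by at most `max_dy`
    and x positions between first.x + x_min_offset and
    first.x + x_max_offset.
    """
    hits = [it for it in items
            if it is not first
            and 0 < it["y"] - first["y"] <= max_dy
            and x_min_offset <= it["x"] - first["x"] <= x_max_offset]
    return sorted(hits, key=lambda it: it["x"])
```

Choosing `max_dy` larger than one line height but smaller than two keeps the window from reaching into the first line of the next order.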