Artificial Intelligence Depot
Data Extraction

Hello All,

I am trying to build an intelligent HTML parser (a web agent) that can extract meaningful data from web pages. I have tried several traditional (non-intelligent) methods, such as simple HTML parsers that only remove the tags, which leaves me with all of the page text rather than just the fields I need. I have also tried regular expressions, but they gave poor results because the way the data is presented varies greatly from one page to another. I have finally realised that my agent needs some intelligence. My knowledge of neural networks is very basic, so I cannot tell whether what I am trying to do is possible using neural networks and AI.

Below is the case that I am trying to solve:
I am looking for houses, and I need the agent to access many (100+) mortgage sites and extract the following data from each one: house address, price, size, and number of rooms. Assuming this is the data required for every house, and that I have the pages from the mortgage sites stored on my machine, are neural networks and AI suitable for solving my problem?
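
For illustration, the tag-stripping plus regular-expression baseline described above might look roughly like the sketch below. It uses only the Python standard library. The sample HTML, the field patterns, and the names `TextExtractor`, `strip_tags`, and `extract_listing` are all hypothetical and not taken from any real site. The address field is left out because it has no fixed textual shape that a simple pattern could match.

```python
import re
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects the visible text of a page and drops all tags."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.chunks.append(text)


def strip_tags(html):
    """Return the page text as one space-separated string."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


# Hypothetical patterns: real sites format these fields in many different ways.
PATTERNS = {
    "price": re.compile(r"\$\s?([\d,]+)"),
    "size": re.compile(r"([\d,]+)\s*(?:sq\.?\s*ft|sqft)", re.IGNORECASE),
    "rooms": re.compile(r"(\d+)\s*(?:rooms?|bedrooms?|beds?)", re.IGNORECASE),
}


def extract_listing(html):
    """Apply each pattern to the stripped text; unmatched fields become None."""
    text = strip_tags(html)
    result = {}
    for field, pattern in PATTERNS.items():
        match = pattern.search(text)
        result[field] = match.group(1) if match else None
    return result


# Made-up sample page fragment, used only to exercise the sketch.
sample = "<div><b>Price:</b> $250,000</div><p>1,800 sq ft, 3 bedrooms</p>"
listing = extract_listing(sample)
print(listing)
```

A page that writes the price as `USD 250000` instead of `$250,000` already defeats the `price` pattern, which is the kind of variation between sites that makes this approach give poor results.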

Suggestions are welcome.

Regards,

Hussam Galal
hussam.galal@gmail.com

Wednesday 07 December, 05:09