
Is it possible to teach a computer to read documents?

    • Thread Starter
    Offline

    15
    ReputationRep:
    With current technology, can computers be programmed to read documents and understand them? Let's say we give one a science journal: can the software work out where the date, the title and so on are?
    Offline

    2
    ReputationRep:
    Why do you care?
    Offline

    3
    ReputationRep:
    (Original post by HucktheForde)
    With current technology, can computers be programmed to read documents and understand them? Let's say we give one a science journal: can the software work out where the date, the title and so on are?
    Search for 'optical character recognition'.

    Some software can do this already, and some software can already pick up context easily.

    So yes.
    • Section Leader
    • Peer Support Volunteers
    Offline

    21
    ReputationRep:
    Moved to a more appropriate section!

    Yes, this is possible. It's even possible nowadays for software to read and describe a picture!
    Offline

    3
    ReputationRep:
    "Understand" is a strong word.
    • Thread Starter
    Offline

    15
    ReputationRep:
    (Original post by oinkk)
    Search for 'optical character recognition'.

    Some software can do this already, and some software can already pick up context easily.

    So yes.
    OCR only converts a picture to text. What I mean is actually reading the text and knowing what it is reading.
    Offline

    17
    ReputationRep:
    Well yes, it is. Just don't expect it to take part in a debate on the subject of the document afterwards...
    Offline

    17
    ReputationRep:
    Yes, absolutely. Once you've converted it to text, text parsing is pretty easy, especially if we're talking about a formal written document (so one written with consistently correct grammar and spelling).

    (Original post by Talon)
    Well yes, it is. Just don't expect it to take part in a debate on the subject of the document afterwards...
    We could probably do that, if we wanted. We've solved more difficult AI problems. That's just not a very interesting one: you just train it with everything ever written about that subject, and get it to patch together previous arguments.
    • Thread Starter
    Offline

    15
    ReputationRep:
    (Original post by BlueSam3)
    Yes, absolutely. Once you've converted it to text, text parsing is pretty easy, especially if we're talking about a formal written document (so one written with consistently correct grammar and spelling).

    We could probably do that, if we wanted. We've solved more difficult AI problems. That's just not a very interesting one: you just train it with everything ever written about that subject, and get it to patch together previous arguments.
    What would it look like? Can it be programmed in a common programming language like VB.NET, C++ or Java? Do we need to program a neural network? Or can we just parse it using some standard rules, like "if it has numbers and two "/" characters, it is a date"?

    Also, I am wondering: will it be able to read a doctor's handwriting?
    Offline

    17
    ReputationRep:
    (Original post by HucktheForde)
    What would it look like? Can it be programmed in a common programming language like VB.NET, C++ or Java? Do we need to program a neural network? Or can we just parse it using some standard rules, like "if it has numbers and two "/" characters, it is a date"?
    Neural networks are one of the more natural ways to do it, yes. Formal documents you can probably parse via simpler pre-programmed rules, but less formal things will require some form of learning algorithm. You could absolutely program this in any programming language you like (they're all Turing-complete, so anything you can compute in one you can compute in any of the others).
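
    To make the rule-based option concrete, here is a minimal Python sketch of the "numbers and two slashes means a date" rule from the question. The function name `find_dates` and the day/month/year ordering are assumptions for illustration only. It also shows why a bare pattern is not enough: the pattern alone would accept 31/02/2016, so the sketch adds a validity check.

```python
import re
from datetime import datetime

# Rule from the question: digits separated by exactly two "/" look like a date.
DATE_PATTERN = re.compile(r"(?<![\d/])(\d{1,2})/(\d{1,2})/(\d{4})(?![\d/])")

def find_dates(text):
    """Return the dates found in text, assuming day/month/year order."""
    dates = []
    for day, month, year in DATE_PATTERN.findall(text):
        try:
            # datetime rejects impossible dates such as 31/02/2016.
            dates.append(datetime(int(year), int(month), int(day)).date())
        except ValueError:
            continue
    return dates

sample = "Received 03/11/2015, accepted 31/02/2016, published 14/03/2016."
print(find_dates(sample))
```

    Rules like this work well while the documents stick to one layout; every new date style ("14 March 2016", "2016-03-14") needs another rule, which is where the learning approach starts to pay off.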

    Also, I am wondering: will it be able to read a doctor's handwriting?
    Depends how bad the handwriting is, how good the camera is, and how good the software is. Given a decent quality camera, a decent learning algorithm and a sizeable body of training data from that doctor to work with, yes.
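
    As a rough illustration of why the training data matters, here is a toy nearest-neighbour sketch in Python. The 3x3 "scans" and their labels are made up for the example, and real handwriting recognition would use far larger images and a proper learning algorithm. It only shows the idea of matching a new sample against labelled examples from the same writer.

```python
# Toy training set: 3x3 bitmaps (three rows joined into one string) with labels.
# These samples are invented for illustration only.
TRAINING = [
    ("010" "010" "010", "I"),
    ("110" "010" "010", "I"),
    ("111" "010" "010", "T"),
    ("111" "010" "011", "T"),
    ("100" "100" "111", "L"),
    ("100" "100" "110", "L"),
]

def distance(a, b):
    """Hamming distance: number of pixels that differ."""
    return sum(pa != pb for pa, pb in zip(a, b))

def classify(bitmap):
    """Return the label of the closest training sample (1-nearest neighbour)."""
    _, label = min(TRAINING, key=lambda item: distance(item[0], bitmap))
    return label

# A slightly messy "L" that does not appear in the training set.
print(classify("100" "100" "011"))
```

    The more samples of that doctor's writing go into `TRAINING`, the more likely a messy new letter lands near a correctly labelled one.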
    Offline

    0
    ReputationRep:
    That is a very interesting question.

    I think it's becoming more and more feasible to get a computer system to read text and understand its context. There are a few problems with this, though. A lot has been written out there, and you can classify it as fact or opinion. Some opinions are even passed off as facts. Something only becomes a fact if it can be proved, and humans in the past have written about "provable" facts which were later proved wrong. So if we try to teach a machine everything in the world by having it read every possible document, would the machine become very confused? Or would it be able to form its own judgments about things?

    A system that could read every written work in human history could be used to create expert systems. Instead of having an expert in a particular area provide the facts and rules for an expert system, we could have a machine learning algorithm create those facts and rules and provide them to the machine, which would then become more knowledgeable in that area.
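
    For anyone unfamiliar with expert systems, the "facts and rules" idea can be sketched in a few lines of Python. The facts and rules below are invented placeholders; in the scenario described here they would be produced by a learning algorithm rather than typed in by a human expert.

```python
def forward_chain(facts, rules):
    """Apply rules repeatedly until no new facts can be derived."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in rules:
            if conclusion not in known and set(conditions) <= known:
                known.add(conclusion)
                changed = True
    return known

# Hypothetical rules: (conditions that must all hold, conclusion).
RULES = [
    (("has title", "has abstract"), "is article"),
    (("is article", "has references"), "is journal paper"),
]

facts = {"has title", "has abstract", "has references"}
print(sorted(forward_chain(facts, RULES)))
```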

    One thing that might be possible in the future is for machines to detect contradictions in people's work. This may help with the first problem I mentioned, where there are multiple theories about a particular problem and the machine has to decide "what is the best answer to this question?". Obviously, this has not been done yet, as it is a very general problem. A lot more work has to be done to tackle even simpler problems than this. Machine learning algorithms have improved a lot in the last 15 years, but they are still relatively slow compared to the human brain. Machines are pretty good with logic (at maths and the like they outperform any human being), but the human brain is powerful beyond words.
    • Thread Starter
    Offline

    15
    ReputationRep:
    (Original post by BlueSam3)
    Neural networks are one of the more natural ways to do it, yes. Formal documents you can probably parse via simpler pre-programmed rules, but less formal things will require some form of learning algorithm. You could absolutely program this in any programming language you like (they're all Turing-complete, so anything you can compute in one you can compute in any of the others).

    Depends how bad the handwriting is, how good the camera is, and how good the software is. Given a decent quality camera, a decent learning algorithm and a sizeable body of training data from that doctor to work with, yes.
    Interesting. I am still trying to grasp what terms like neural network and algorithm mean; I think there is so much research to do.

    I am trying to write a program that can read handwriting.

    Tbh, I am just so into robotics and artificial intelligence. I hope I can get a job in this industry, but I don't know how.
    • Very Important Poster
    • PS Reviewer
    Offline

    21
    ReputationRep:
    https://support.office.com/en-us/art...a-753eed6d7424

    If you have documents in word this is a great tool (I use it on my reports - if MSWord can pick out the key points then so can most audiences - and on really LONG documents if I can't be arsed reading them in full).
    Offline

    17
    ReputationRep:
    (Original post by HucktheForde)
    Interesting. I am still trying to grasp what terms like neural network and algorithm mean; I think there is so much research to do.

    I am trying to write a program that can read handwriting.

    Tbh, I am just so into robotics and artificial intelligence. I hope I can get a job in this industry, but I don't know how.
    As far as getting a job goes: go and do a degree in CS/maths, and learn a metric **** ton about this stuff specifically (probably involving doing a masters, maybe a PhD).
    Offline

    0
    ReputationRep:
    (Original post by HucktheForde)
    Interesting. I am still trying to grasp what terms like neural network and algorithm mean; I think there is so much research to do.

    I am trying to write a program that can read handwriting.

    Tbh, I am just so into robotics and artificial intelligence. I hope I can get a job in this industry, but I don't know how.
    You don't need a degree to be able to do what you would like to do. It is usually harder without one, because you don't have the guidance and support of professors, but it's not impossible. I found this, which has a lot of information to take in, but it's a difficult subject and the writer did a really good job of putting it all together in one chapter. There is sample code as well as a lot of theory. Have fun!

    http://neuralnetworksanddeeplearning.com/chap1.html
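
    To give a taste of what this kind of material covers, here is a minimal perceptron sketch in Python. It is not code from the linked chapter; the function names, learning rate and epoch count are assumptions chosen for the example. It learns the NAND function from four examples using the classic perceptron update rule.

```python
def predict(weights, bias, inputs):
    """Fire (1) if the weighted sum plus bias is positive, else 0."""
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if total > 0 else 0

def train(samples, epochs=20, rate=1):
    """Perceptron learning rule: nudge weights towards each missed target."""
    weights, bias = [0, 0], 0
    for _ in range(epochs):
        for inputs, target in samples:
            error = target - predict(weights, bias, inputs)
            weights = [w + rate * error * x for w, x in zip(weights, inputs)]
            bias += rate * error
    return weights, bias

# Truth table for NAND: output is 0 only when both inputs are 1.
NAND = [((0, 0), 1), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]
weights, bias = train(NAND)
print([predict(weights, bias, x) for x, _ in NAND])
```

    A handwriting recogniser is the same idea scaled up: many more inputs (one per pixel), many more neurons, and a smoother update rule.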
    Offline

    17
    ReputationRep:
    (Original post by cattleofra)
    You don't need a degree to be able to do what you would like to do. It is usually harder without one, because you don't have the guidance and support of professors, but it's not impossible. I found this, which has a lot of information to take in, but it's a difficult subject and the writer did a really good job of putting it all together in one chapter. There is sample code as well as a lot of theory. Have fun!

    http://neuralnetworksanddeeplearning.com/chap1.html
    What he wants to do, though, is the cutting edge of current AI research. He's going to have to get to the research front somehow, and a PhD is pretty much the way to get there.
 
 
 