Is there a way to determine whether a request coming into a handler (let's assume the handler responds to GET and POST) is being performed by a real browser versus a programmatic client?
I already know that it is easy to spoof things like the User-Agent and the Referer, but are there other headers that are more difficult to spoof? Maybe headers that are not commonly available in classes like .NET's HttpWebRequest?
The other path I looked at is using the encrypted ViewState to send a value to the browser that gets validated on the server side, but couldn't that value simply be scraped from the previous response and added as a POST parameter to the next request?
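To make the concern concrete, here is a minimal sketch of the kind of round trip I mean (OneTimeTokenStore is a hypothetical helper, not the real ViewState machinery). Even if each token is single-use, a scraper can still fetch a fresh page to harvest a fresh token:

```csharp
using System;
using System.Collections.Generic;
using System.Security.Cryptography;

// Hypothetical one-time token store: each issued value validates exactly once.
class OneTimeTokenStore
{
    private readonly HashSet<string> _outstanding = new HashSet<string>();

    // Called while rendering the page; the result goes into a hidden field.
    public string Issue()
    {
        var bytes = new byte[16];
        using (var rng = RandomNumberGenerator.Create())
        {
            rng.GetBytes(bytes);
        }
        string token = Convert.ToBase64String(bytes);
        _outstanding.Add(token);
        return token;
    }

    // Called when the POST comes back; a replayed token fails.
    public bool Validate(string token) => _outstanding.Remove(token);
}

class Demo
{
    static void Main()
    {
        var store = new OneTimeTokenStore();
        string token = store.Issue();             // embedded in the response
        Console.WriteLine(store.Validate(token)); // True: first submission
        Console.WriteLine(store.Validate(token)); // False: scraped-and-replayed token
    }
}
```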
Any help would be much appreciated. Cheers!
There is no easy way to differentiate, because in the end a programmatic POST looks the same to the server as a POST made by a user in a browser.
As mentioned, CAPTCHAs can be used to control posting but are not perfect (it is very hard, but not impossible, for a computer to solve them). They can also annoy users.
Another route is only allowing authenticated users to post, but this too can still be done programmatically.
If you want to get a good feel for how people are going to try to abuse your site, then you may want to look at http://seleniumhq.org/
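To see how little code such abuse takes, here is a minimal Selenium sketch in C# (the URL and element names are placeholders). Because it drives a real browser, every header, cookie, and piece of JS-generated state looks exactly like a human visit:

```csharp
using OpenQA.Selenium;
using OpenQA.Selenium.Chrome;

class FormBot
{
    static void Main()
    {
        // Launches a real Chrome instance; the server sees a genuine browser.
        using (IWebDriver driver = new ChromeDriver())
        {
            driver.Navigate().GoToUrl("http://example.com/your-form");        // placeholder URL
            driver.FindElement(By.Name("comment")).SendKeys("scripted text"); // placeholder field name
            driver.FindElement(By.Id("submit")).Click();                      // placeholder button id
        }
    }
}
```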
This is very similar to the famous Halting Problem in computer science. See more on the proof, and on Alan Turing, here: http://webcache.googleusercontent.com/search?q=cache:HZ7CMq6XAGwJ:www-inst.eecs.berkeley.edu/~cs70/fa06/lectures/computability/lec30.ps+alan+turing+infinite+loop+compiler&cd=1&hl=en&ct=clnk&gl=us
The most common way is using CAPTCHAs. Of course, CAPTCHAs have their own issues (users don't really care for them), but they do make it much more difficult to programmatically post data. They don't really help with GETs, though you can force users to solve a CAPTCHA before delivering content.
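As a sketch of that last point, an ASP.NET handler could refuse to serve a GET until the session has been marked as having passed a CAPTCHA (the session key and redirect target here are hypothetical):

```csharp
using System.Web;
using System.Web.SessionState;

// Hypothetical sketch: content is only served once the CAPTCHA page
// has set Session["CaptchaPassed"] for this visitor.
public class GatedContentHandler : IHttpHandler, IRequiresSessionState
{
    public bool IsReusable => false;

    public void ProcessRequest(HttpContext context)
    {
        if (!(context.Session["CaptchaPassed"] as bool? ?? false))
        {
            context.Response.Redirect("~/captcha.aspx"); // hypothetical CAPTCHA page
            return;
        }
        context.Response.Write("protected content");
    }
}
```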
There are many ways to do this, such as dynamically generated XHR requests that can only be triggered by human actions.
Here's a great article on NP-Hard problems. I can see a huge possibility here: http://www.i-programmer.info/news/112-theory/3896-classic-nintendo-games-are-np-hard.html
One way: you could use some tricky JS to handle tokens on click. Your server issues token IDs to elements on the page during the backend render phase, and you log these in a database or data file. Then, when users click around and submit, you compare the IDs sent via the onclick() handler against what you issued.

There are plenty of ways around this, but you can apply some heuristics to determine whether posts arrive too fast to come from a human. That is, even if someone scripts the hijacking of the token IDs and auto-submits, you can check whether the time between click events looks automated (see the sketch below).

Signed up for a Twitter account lately? They use passive human detection that, while not 100% foolproof, is slower and more difficult to break. Many, if not all, of the spam accounts there had to be opened by a human.
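A minimal sketch of the timing side of this idea (the threshold and class names are my own assumptions, and a real implementation would persist the log rather than keep it in memory):

```csharp
using System;
using System.Collections.Generic;

// Hypothetical click-token log: records when each token ID was rendered and
// flags submissions that come back faster than a human could read and click.
class ClickTokenLog
{
    private readonly Dictionary<string, DateTime> _issuedAt =
        new Dictionary<string, DateTime>();

    // Assumed threshold; tune it from real traffic.
    private static readonly TimeSpan MinHumanDelay = TimeSpan.FromSeconds(2);

    public void Issue(string tokenId) => _issuedAt[tokenId] = DateTime.UtcNow;

    public bool LooksHuman(string tokenId)
    {
        DateTime issued;
        if (!_issuedAt.TryGetValue(tokenId, out issued))
            return false;              // unknown or already-used token
        _issuedAt.Remove(tokenId);     // single use
        return DateTime.UtcNow - issued >= MinHumanDelay;
    }
}

class Demo
{
    static void Main()
    {
        var log = new ClickTokenLog();
        log.Issue("btn-42");                         // during render
        System.Threading.Thread.Sleep(100);          // scripted client fires almost instantly
        Console.WriteLine(log.LooksHuman("btn-42")); // False: too fast to be a real click
    }
}
```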
Another way: http://areyouahuman.com/
As long as you are using encrypted methods, verifying humanity without a crappy CAPTCHA is possible. I mean, don't ignore your headers either; these are complementary approaches.
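For the header side, a simple (and admittedly spoofable) sanity check might look like this; the specific headers chosen are just an illustration, so treat the result as one weak signal among several:

```csharp
using System.Collections.Specialized;

static class HeaderHeuristics
{
    // Real browsers almost always send these three headers; naive
    // HttpWebRequest clients often omit them unless explicitly set.
    public static bool LooksLikeBrowser(NameValueCollection headers)
    {
        return headers["User-Agent"] != null
            && headers["Accept"] != null
            && headers["Accept-Language"] != null;
    }
}
```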
The key is to have enough complexity that the challenge amounts to an NP-complete problem: the number of ways to solve the total set of problems is extraordinary. http://en.wikipedia.org/wiki/NP-complete
When the day comes that AI can solve multiple complex human problems on its own, we will have bigger things to worry about than request tampering.
Another company doing interesting research is http://www.vouchsafe.com/play-games. They use games as reverse Turing tests (RTTs), designed so that playing them actually trains the RTT to be solvable only by humans!