Computer Program to Spot Fake Reviews?
Acknowledgment to Cornell University ChronicleOnline
As you probably know, opinion and review sites like Yelp and TripAdvisor are littered with fake reviews. Review sites are regular targets for phony reviews – both positive reviews created by owners and managers and negative reviews to denigrate competitors.
Although many people claim they can spot fakes, recent research by Cornell University showed that in reality we are very poor at differentiating true from false reviews - human judgment proved no better than tossing a coin.
However, the same Cornell researchers have developed a computer program that is much better than humans at differentiating true from false reviews.
The work was reported in June 2011 at the annual meeting of the Association for Computational Linguistics in Portland, Ore., by Claire Cardie, professor of computer science, Jeff Hancock, associate professor of communication, and graduate students Yejin Choi and Myle Ott.
The team employed a group of people to deliberately write 400 fake positive reviews of 20 Chicago hotels. These were compared with an equal number of genuine positive reviews for the same hotels.
Human judges – volunteer Cornell undergraduates – scored no better than chance in identifying fake reviews. They did not even agree on which reviews they thought were false, reinforcing that they were doing no better than chance.
According to the research team, humans suffer from a “truth bias” and assume reviews to be true until they find evidence to the contrary. However, when people are trained to detect deception they tend to become overly sceptical, swinging too far the other way and reporting deception too often, while still scoring no better than chance at telling true from false.
Computer analysis of the text of known true and false reviews revealed, amongst other things, that truthful hotel reviews were more likely to use concrete words relating to the hotel, like “bathroom,” “check-in” or “price.” Fakes included more context-setting words like “vacation,” “business trip” or “my husband.”
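To make the idea of comparing word use concrete, here is a minimal sketch in Python. It is not the researchers' code: the function names are hypothetical, the four example reviews are invented for illustration, and the comparison (a simple difference in each word's share of all words) is an assumption chosen for brevity.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase a review and split it into words, keeping hyphens as in 'check-in'."""
    return re.findall(r"[a-z]+(?:-[a-z]+)*", text.lower())

def word_rates(reviews):
    """Return each word's share of all word tokens across a list of reviews."""
    counts = Counter(tok for review in reviews for tok in tokenize(review))
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}

def distinctive_words(group_a, group_b, top=3):
    """Words whose share in group_a most exceeds their share in group_b."""
    a, b = word_rates(group_a), word_rates(group_b)
    diff = {word: a[word] - b.get(word, 0.0) for word in a}
    # Sort by largest difference first, breaking ties alphabetically.
    return sorted(diff, key=lambda word: (-diff[word], word))[:top]

# Invented toy reviews, not data from the study.
truthful = ["The bathroom was clean and the price was fair.",
            "Check-in was quick and the bathroom was large."]
fake = ["My husband and I loved our vacation here.",
        "Perfect for a vacation with my husband."]

print(distinctive_words(truthful, fake))
print(distinctive_words(fake, truthful))
```

On a sample this small, common words such as "the" and "was" also rank highly, which is one reason a real analysis needs far more text than this.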
Using this and other text analyses, the researchers trained a computer program on one set of true and false reviews, then tested it against the rest of the database. By combining keyword analysis with the ways certain words were combined in pairs, the program identified deceptive reviews with 90 percent accuracy.
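The train-then-test procedure can be sketched as follows. This is only an illustration under stated assumptions: it uses a simple Naive Bayes classifier over single words and adjacent word pairs, which may differ from the method the researchers actually used, and the class name, labels and toy reviews are all invented here.

```python
import math
import re
from collections import Counter

def features(text):
    """Single words plus adjacent word pairs from a lowercased review."""
    words = re.findall(r"[a-z]+(?:-[a-z]+)*", text.lower())
    pairs = [f"{a} {b}" for a, b in zip(words, words[1:])]
    return words + pairs

class NaiveBayes:
    """Multinomial Naive Bayes with add-one smoothing (illustrative only)."""

    def fit(self, texts, labels):
        self.counts = {label: Counter() for label in set(labels)}
        self.docs = Counter(labels)
        for text, label in zip(texts, labels):
            self.counts[label].update(features(text))
        self.vocab = set().union(*self.counts.values())
        return self

    def predict(self, text):
        def score(label):
            total = sum(self.counts[label].values()) + len(self.vocab)
            prior = math.log(self.docs[label] / sum(self.docs.values()))
            # Features never seen in training are ignored.
            return prior + sum(
                math.log((self.counts[label][f] + 1) / total)
                for f in features(text) if f in self.vocab)
        return max(sorted(self.counts), key=score)

# Invented toy data: a training set and a held-out test set.
train = [
    ("The bathroom was small but the price was fair.", "truthful"),
    ("Check-in took five minutes and the bathroom was clean.", "truthful"),
    ("My husband and I had a wonderful vacation.", "deceptive"),
    ("A perfect business trip, my husband loved it.", "deceptive"),
]
held_out = [
    ("Good price and a clean bathroom.", "truthful"),
    ("Our vacation was perfect, my husband agrees.", "deceptive"),
]

model = NaiveBayes().fit([t for t, _ in train], [l for _, l in train])
accuracy = sum(model.predict(t) == l for t, l in held_out) / len(held_out)
print(f"Held-out accuracy: {accuracy:.0%}")
```

The key design point is that accuracy is measured only on reviews the program did not see during training. The score on a hand-picked toy set like this one says nothing about real-world performance.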
Further research needs to be undertaken to see if a similar analysis can be equally successful at spotting true and false negative reviews.
This sort of software might be used by review sites as a “first-round filter,” Ott suggested. If, say, one particular hotel gets a lot of reviews that score as deceptive, the site should investigate further.
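A "first-round filter" of this kind might look something like the sketch below. The function name, the 30 percent threshold and the minimum of three reviews are all hypothetical choices made for illustration. The hotel names and scores are invented, and the per-review verdicts are assumed to come from some classifier run beforehand.

```python
from collections import defaultdict

def flag_hotels(scored_reviews, threshold=0.3, min_reviews=3):
    """Return hotels whose share of reviews scored as deceptive exceeds threshold.

    scored_reviews is an iterable of (hotel, is_deceptive) pairs.
    Hotels with fewer than min_reviews reviews are skipped as too sparse to judge.
    """
    totals = defaultdict(int)
    deceptive = defaultdict(int)
    for hotel, is_deceptive in scored_reviews:
        totals[hotel] += 1
        deceptive[hotel] += int(is_deceptive)
    return sorted(hotel for hotel in totals
                  if totals[hotel] >= min_reviews
                  and deceptive[hotel] / totals[hotel] > threshold)

# Invented example scores, not real data.
scored = [
    ("Hotel A", True), ("Hotel A", True), ("Hotel A", False),
    ("Hotel B", False), ("Hotel B", False), ("Hotel B", True), ("Hotel B", False),
    ("Hotel C", True),
]
flagged = flag_hotels(scored)
print(flagged)
```

A flag here only marks a hotel for human follow-up; it does not prove that any individual review is fake.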
“While this is the first study of its kind, and there’s a lot more to be done, I think our approach will eventually help review sites identify and eliminate these fraudulent reviews,” Ott added.
Unfortunately, once everyone knows what the computer program looks for, the fake review writer will also know how to trick it. Back to square one …