We believe in a world where all information flows seamlessly between your devices, services, and applications, and you never have to read through a webpage to get an answer to your question. This requires building a new kind of search, one that sees the entire web as structured information rather than as documents.
At Diffbot, we apply computer vision and natural language processing to the problem of structuring information. Located a block from the Stanford campus, Diffbot is the first startup incubated by Stanford StartX and is funded by Sun Microsystems co-founder Andy Bechtolsheim and EarthLink founder Sky Dayton. We’re a small but growing team of world-class machine learning, natural language processing, and web search pioneers. Our APIs currently power many of the world’s largest internet sites.
Team of 8, with a mix of recent grads, serial entrepreneurs, and web veterans
Machine learning at web scale: it’s not just a part of what we do, it *is* what we do
Massive datasets (both supervised and unsupervised) and real-time loads, with many classifiers that perform above human-level accuracy
Many proprietary and exotic technologies for visual rendering, statistical modeling, and web search
Sustainable revenue and growth plan
Well-funded with excellent pay and benefits
Beautiful environment within walking distance of the Stanford campus and restaurants
Machine vision engineers at Diffbot are a resourceful bunch, always looking to squeeze every drop of signal out of a dataset. Unlike machine learning roles at other companies, our goal is to extract the unequivocal truth from a source document, not a subjective ranking, sentiment, or score. Because of this higher standard for accuracy, we’ve had to create new systems for handling training data and invent novel, performance-optimized algorithms.
Mix of object classification, scene understanding, and document analysis in a novel setting
Derive features by combining signals from disparate sources
Invent and test new ML techniques that can generalize to the web
Leverage near-infinite amounts of unsupervised training data on fast machines (40 cores, 120 GB RAM, SSD, GPU)
To apply, send an e-mail to jobs(at)diffbot.com introducing yourself to the team. Let’s create the future of the web together.