It’s no secret that Hadoop and Apache Spark are the hottest technologies in big data, but what’s less often remarked upon is that they’re both open-source. Mike Tuchen, a former Microsoft executive who is now CEO of big-data vendor Talend, thinks that’s no coincidence. “We’re seeing a changing of the guard,” he said. “We expect […]
Month: February 2016
Getting Started with Spark (in Python)
Hadoop is the standard tool for distributed computing across really large data sets and is the reason why you see “Big Data” on advertisements as you walk through the airport. It has become an operating system for Big Data, providing a rich ecosystem of tools and techniques that allow you to use a large cluster […]