Reinvent Yourself. And the World.


A colleague of mine said something great yesterday: “IBM is a place where you can re-invent yourself.” He’s right. Many of us were hired to do one job, and were then given the opportunity by IBM to define our own roles, deciding for ourselves the best match between the company’s changing needs and our talents. There’s no lockstep management, and no demoralized employees wilting their lives away in cubicles.

At the Spark Technology Center we’ve been given the mandate to operate as a company within a company. You can feel the energy when you walk in: there’s the can’t-miss designer area (follow the trail of Futurist-inspired paper airplanes), developers right next to them so they can share ideas, and comms and marketing people sitting wherever we want, really. Nearest the Italian coffee maker, usually.

We’re actively working on new code for the Apache Spark open source project, and we’ve contributed and are continuing to work on SystemML, IBM’s declarative large-scale machine learning research project. We collect the best thinking on Spark, research into use cases, machine learning, and data analytics design from IBM and the wider community, and put it before an international audience. We host data product-building workshops (with live bands and good wine), and our data designers hold pop-up, hands-on design sessions that give a visual “face” to Spark. We create new projects every month; among other things, we’re working to improve Spark SQL performance, with a focus on TPC-DS, and to improve DataFrames and Datasets. Our leading thinkers on Spark, like Holden Karau, host “office hours” at coffee shops in SoMa to exchange ideas. And because of our strong executive sponsorship, we have the wind at our backs. It’s the exact opposite of what you’d expect from a large company: it’s phenomenally easy to get things done. You just have to find the time and people to do them.

To do that, we’re hiring. If you’re a talented developer, designer, or communications and marketing person, or if other people describe you as a “thought leader” on Spark (no one good describes themselves that way, do they?), apply! You’ll have room to invent yourself and your career, and to make a real contribution to data analytics and Apache Spark, at a significant moment in the history of open source. It’s the opportunity to create something that everyone is starting to imagine but that hasn’t yet been made real. You can feel the pieces coming together from all fronts: design, math, code, business strategy.

When I started at IBM, someone said to me, “I can’t see you there. It’s so conservative!” There’s so much value in IBM’s conservative values—which I’m happy to say include trust, and innovation that matters. But if you’re imagining cubicles and wilting, please drop by. Find me on Twitter or email us at sparkibm@gmail.com. We promise you a colorful visit.

