The Data Scientist Oath (Part 2)

Today, we are bombarded by data and information presented as FACTS. We accept what is in a chart from any printed source as if it is a FACT and therefore true. Let me illustrate with a sublime example.

half & half pie chart
Chart 1: A simple pie chart

If the title of this chart is “Households with Gourmet Cooks”, you might be influenced to run out and buy stock in a company that makes gourmet cooking equipment, like Middleby Corporation, the makers of Viking kitchen equipment. If the title of this chart is “Households with Gourmet Cooks (sample size = 2)” or “Households with Gourmet Cooks (Std. Dev. = N/A)”, you’d probably not. In fact, you’d probably wonder why I built the chart, and yet we frequently fall into this trap and don’t even ask for the transparency. Data scientists presenting their data should always tell you how they arrived at their conclusions.
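To see why “sample size = 2” should stop you cold, here is a minimal Python sketch. It assumes the 50/50 chart came from a simple yes/no survey and uses the Wilson score interval, which is my choice of method for illustration and not anything the chart itself tells you. The sample size of 2000 is a made-up comparison point.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Approximate 95% Wilson score interval for an observed proportion."""
    if n <= 0:
        raise ValueError("sample size must be positive")
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width


# Same 50/50 pie chart, two hypothetical sample sizes.
for n in (2, 2000):
    low, high = wilson_interval(n // 2, n)
    print(f"n={n}: 50% observed, plausible range {low:.1%} to {high:.1%}")
```

With two households, the plausible range runs from roughly 9% to 91%, so the chart tells you almost nothing. With 2000 households, the same 50% is pinned down to within a couple of points. The picture is identical in both cases; only the disclosed sample size tells you which one you are looking at.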

Unfortunately, it happens almost every day in trusted sources ranging from news magazines and newspapers to documents at work and every aspect of the Internet (social, web sites, e-mail, etc.). The Internet is a huge force multiplier, which enables one zealot to look like an entire movement and makes their arguments look like widely accepted FACTS.

 

 

There are even more subtle ways of influencing you, especially with cultural and emotional cues. If you look at the two charts, which show identical data, what happens if the title for both is “Evil is winning the war over Good”?

In the first chart, you might initially draw the conclusion that evil is winning big time. We assume red represents evil since red is the color of the devil in the Western world. We react to the color and not the fact that evil is a 2% green slice. Plus, the text in the legend is small and hard to read. In the second chart, you’d look at it and think the author is an idiot since the evil slice is clearly a tiny 2%. How information is represented is almost as important as the source data and the methods used to turn it into information. Even good information can be displayed poorly.

P. T. Barnum said “you can’t fool all the people all the time, but you can fool all of the people some of the time,” and I agree that we all get fooled occasionally. It is each individual’s responsibility to consume data carefully and to consider the source and how the information is being displayed, so that “some of the time” shrinks to almost never.

It is the responsibility of the data scientist to try to present the information in good faith, transparently, and with as little bias as possible. Stan Lee, through wise Uncle Ben Parker in the Spider-Man comics, said that “with great power comes great responsibility.” If you work with data and publish it, then understand the potential influence and power you have over people’s opinions, thoughts, feelings, and potential actions, and use it wisely.

Start with a simple Data Scientist oath or code. “As a Data Scientist, I will understand the veracity and validity of my data and its sources, and I will clearly, transparently and with minimal cultural bias display the results so the end consumer can make valid conclusions.”

It is that simple. How cool is it that you get to side with Stan Lee and all his comic book heroes and become a real hero by making every effort to represent the truth as plainly and obviously, and with as much unflinching transparency, as humanly possible?

 

The Data Scientist Oath (Part 1)

I believe in the truth. Truth is in the eye of the beholder. When the beholder is a single person, or a very small part of the population, with an idea, that person or group will try to convince the rest of the population that it is correct. Almost every conversation is a negotiation of what should be the dominant truth.

“There are three kinds of lies: lies, damned lies, and statistics [Data Scientist Output]” – Benjamin Disraeli.

Today, we employ data scientists to distill data and facts into information. I believe we should hold data scientists to a higher standard than most people. It is their sworn duty to ensure that they understand the veracity of the data and make both it and their conclusions 100% transparent.

4-Vs-of-big-data

While Veracity is just one of the 4-V’s of Big Data, it is the most critical element. Up until now, we assumed data was gathered through a scientific process, using a scientific instrument, guided by a scientific review, and driving to a scientific conclusion about a hypothesis. Unfortunately, today, that is an incorrect assumption.

We must understand that all data must be considered big data. I realize that technically all data is NOT big data, but as we shift from using data only in science to using it in every aspect of our lives, we must now treat it all as big data.

Part one of the Data Scientist Oath is about the input of data into the model. It is critical that Data Scientists always use the highest quality data possible for their models, from a known source, so that they can treat the veracity of the data correctly and be transparent with their clients in the final products.

In the next part, I’ll address the output side of the oath with some more simple examples that clearly illustrate how easily facts can be distorted. I’ll conclude with a summary and some next steps towards a Data Scientist Oath.

 

Why do I blog (and write stupid, long winded emails)

The drive to communicate in writing. Why we still tolerate and love mail.

Because I think, I write blogs (and write stupid, long winded emails). Honestly, it is how I think or, more precisely, how I organize. The harder the activity, without being overbearing, the more it forces me to organize my thoughts. Talking is the least of the offenses, so if I’m droning on like a sewing machine, I’m working out a problem with you. If I’m in the shower, it’s my senses that are filtering my thoughts (water temp, standing, shaving my head (I’m bald damn it!), wondering where the water goes, if that fly I flushed when I was 12 has landed in a nuclear waste facility and is now coming back via the drain to kill me, etc.). Maybe even while driving with me. I’ve heard Bill Clinton talks with his hands even when he drives. Strongly suggest Uber if you are with Bill.

On standing – Standing is remarkably tough if you look at it as a biologist, so it counts as a non-demanding filter. Ever tried standing and thinking and suddenly noticed you’ve walked to the grocery store? Not walking; that is a reaction. Douglas Adams (author, not sane person, The Hitchhiker’s Guide to the Galaxy, et al.) talks about flying as the art of throwing yourself at the ground and missing. So the better word for walking is “a reaction for not falling while traversing a plane (level space) in a 3 dimensional world with Newtonian gravity”. Babies start on their toes and try to go fast to overcome forward inertia. My brother was very fast because he always walked on his toes. It is also genetic, as his daughter does, too.

I LOVE you. Oh don’t go all homophobic on me, please! And since it is in Magenta you don’t believe me, but it really is my heterosexual brain messing with you, really, I think, I mean it is; maybe it’s only my brain that is heterosexual, Oh shit!, maybe; grunt “gun”; grunt “football”, say “lumberjack”, cry “man”, sing “show tunes”, OH shit!; never mind. I mean I really, really like you, like you had friends when you were in grade school and not like Facebook, which is one of the more insulting but viral trends of our generation and maybe the millennium, but I digress and it is Marc Andreessen’s fault who begat Mosaic who begat the WWW (world wide web) who begat Mark Zuckerberg after about a sea of begetting (plus you learn stuff, more later, beget) and digress more and it is still Facebook’s fault and only because I really like the word begat and found out it came from the word beget (this is now later) which is totally cool, wonderful and a Sesame Street moment and I’m Ernie and he’s gay (SHIT!).

Yeah, I’m definitely starting to not like hate like and fine (but that was for extra points since my wife says FINE and THE F-word are the same words spelled differently and lots more on that later). Fabulous is a good F-word. And so what is it about sex? Freud, “shut up,” you are sick!

 

Tornado breath and lightning brain or vice versa.

Back to the other side of my brain: writing helps me slow the output so I can organize my thoughts. I used to think it slowed my thoughts down or sharpened them, but the truth is they come at me like a summer thunderstorm: fast, heavy, loud, dangerous, and then not at all. When I can contain them like lightning in a bottle, I’m an utter genius. When not, let’s just say I get offered pink slips, my friends hate me, and even my close friends buy me breakfast to start my day over, including strong adult beverages.

Back to email [in case you thought I forgot], if you are getting my email, it means we are friends and I respect you even if I don’t like you and I want your opinion and I’m alive, thinking, filtering, and just maybe I think you’d be interested. Hey, you are actually being helpful. And I may like you or really like you (Damn!). And from my view, I’d rather you respected me than liked me. It is less painful in every musical I’ve ever seen on Broadway. I have friends and they’re OK! (here we go again with Mark Zuckerberg and Facebook, damn). In this decade it is the equivalent of hanging out on the corner, the barber shop or the market or, dare my highly heterosexual brain think it, the beauty parlor. This is big, hairy, scary stuff, like Brigitte Bardot and Bouffants (here we go again!).

Why I wrote this blog entry. First, I’ve always been fascinated with how people think. Second, I love humor and I thought if 5 miserable people in NYC could be funny in a show about nothing, then this is definitely funny. But finally, and everything after the but is the real truth, it is Friday and I don’t want to do my EOW work bureaucratic administrative paperwork on the laptop. In short, I’m procrastinating and dragging you with me.

And for those of you who are my friends and even family and read my blog, you can stop reading now because you know enough about me. As for my apology for any sexual, gender, race, or ethnicity discrimination or innuendo, and any harm that may have been done to dogs, cats, gerbils, hamsters, amoebas, pine cones, mushrooms, small furry creatures from Alpha Centauri, etc., real or imagined: I beg your forgiveness in perpetuity since this is all in jest; (with my high squeaky voice) maybe.

 

Career 2.0 @ IBM

I’m still at IBM for 2 major reasons. First, I like and respect the majority of the people I work with at IBM. I think some even like me. Second, the ability to be an intrapreneur. I’m free to drive my career anywhere I choose within IBM. Recognizing when to change can be difficult, especially if you are successful and comfortable in your current role. For many years, I was VERY comfortable in my role, even after I made Distinguished Engineer (DE) in 2010.

Suddenly, in 2014, I thought I was going to leave IBM. In 2010, I had made IBM Distinguished Engineer (an IBM executive via a technical path) and had helped a few others achieve the executive rank of DE. What was left? Now in 2014, I realized I required a new challenge. My first thought was to go run a mid-tier, highly technical consulting practice, probably around SAP. After all, I had 17 years in the IBM SAP practice and almost 22 years of SAP experience. The other option was to become a technical leader for SAP in a large, top 5 consulting firm. Eventually, I chose to stay at IBM and here is why.

I knew I did not want to go backwards. I did talk to a lot of firms, but all of them wanted me to go back to simply selling work. When I looked at most of these firms, they clearly were not getting the CAMS, Agile, DevOps, Cognitive message. They had pieces, but had not fully embraced it, and I know it will be a tidal wave that will overwhelm them. I didn’t want to drown in that wave, especially if I was just getting started at a new firm.

I do have a great network at IBM. While a lot of it was around SAP – a lot was not. I started reaching out to my mentors. IBM really encourages mentoring, which is something I believe in (Achieving your personal destiny through mentoring). It is not so much a sign on the dotted line type of program as an informal network of people whom you trust and can call upon for an honest opinion. I’m fortunate to have IBM Fellows, IBM Distinguished Engineers, Partners, Associate Partners, and some super smart, hard working, honest techies as part of my network. A few of these individuals I actually mentor, but at the same time they mentor me even if they are not aware. Almost all of them came back with the same message more or less – “why are you constraining yourself to just SAP?”

Suddenly, I realized that IBM has a lot of very cool, cutting edge, exciting things happening. IBM has announced a new data scientist division and the cognitive computing age, which takes AI to a new level, and is investing in the Internet of Things (IoT), IBM cloud, and Bluemix, to name a few key areas. Why not get involved in one of these?

The problem was me. I had to be honest with myself – I was scared. While all of these others believed I could easily learn new areas, I wasn’t so sure. Maybe I had run out of capacity. Maybe I’d fail or, worse, prove myself to be stupid. Maybe I had simply learned a lot a long time ago and was riding my past success. We all know executives who had one giant success a decade or more ago, are still standing on that one success, and have never had another one.

I decided to believe my mentors and believe in myself. Worst case, I could leave later. On September 1, 2015 (A new bigger palette) I moved to a new role, settling on the title of “IBM Technology Strategy Executive”. After 9+ months, a few 100K airline miles, and a lot of great engagements, I’m very happy with my decision to stay an IBMer.

 

Giving generously and effectively

The citizens of the USA are generous in giving to charity – just make sure it counts, because not all charities are equal. Some very legitimate sounding ones are a complete rip-off, only making the officers of the charity rich and doing nothing, or nearly nothing, for the targeted cause. According to the National Center for Charitable Statistics, we give an amazing $228.93 billion, of which 72% is by individuals. Overall, we contribute between 2.4% and 5.9% of our income. Unfortunately, the lower end of the range comes from the higher end of professional salaries. I guess en route to becoming truly rich, or becoming frustrated at not yet being rich, donations become a lower priority.

I strongly suggest you take a look at the ratings on a site like Charity Navigator. It does not rank charities, but rates them from 4-star down to 0-star in terms of effectiveness and does not discriminate by cause. Whether you choose to give to Carolina Flood Relief, Breast Cancer, Religious organizations, Animals, etc., it can help your dollars reach the right audience. The American people are certainly fortunate and, for most of us, it is a land of opportunity. Yet for many, both in this country and around the world, it is not so, and they could benefit from your help and a fraction of your resources. Please consider giving generously and wisely.

As a matter of disclosure, I do give Charity Navigator a small annual donation which my employer, IBM, matches.

IBM Innovation – A new bigger palette

My job is now IBM innovation as a SPEED Proof of Concept (POC) CTO. After 22 years of daily focus on SAP products, starting with SAP Basis Training in 1993 on R/3 2.1, and making the SAP technology real for clients, I am now part of a team to accelerate IBM innovation. In those 22 years, of which 17 were with IBM’s SAP practice, SAP has done a marvelous job of expanding from one black crayon (R/3) to a second blue crayon (BW) and then building up to a riot of colored crayons. It was a wonderful journey of learning, great teams, wonderful adventures, many friends, and successful clients.

Now, as a SPEED POC CTO, I have not only crayons but every artistic medium possible to help clients realize their digital enterprise via innovation. My job will be to pull together every element (those IBM has to offer, those in the marketplace including SAP, those that the client possesses, and the brain power in IBM) to solve client conundrums with innovative capabilities. Delivering IBM innovation with speed, velocity, and acceleration to empower our clients is critical. It is so critical that SPEED is not an acronym, but the name, the way, and the mode we work – hence an IBM SPEED CTO.

The SPEED POC team has already had some great successes. I hope to bring my knowledge of technology, architecture, cloud, analytics, mobile, social, security (CAMSS) and applications to the team. More importantly, I want to bring what my clients have taught me: that cool, new or innovative is not important in and of itself. It is only when cool, new and innovative opens new business capability that drives new business that it matters. I have 3 simple rules I follow when working with clients that I will continue to use.

  1. Understand the client’s real problem and why it is important
  2. Clearly, concisely, describe the proposed solution to the client’s problem
  3. Articulate why IBM is the best choice to deliver the solution

There is no doubt that joining the digital enterprise ecosystem is a critical theme (see last blog on S/4 and digital ecosystem) for IBM’s clients. I’m looking forward to helping IBM drive it with SPEED as a SPEED POC CTO.

So this blog will shift away from an SAP-centric focus to an IBM-centric focus on technology, but the technology has only ever been a portion of the discussion in this blog. I hope you choose to follow. I promise to be just as forthright with you as before. Thanks.

PS: How I got here Career 2.0 @ IBM.

SAP S/4 HANA must be the Digital Enterprise Ecosystem Hub to succeed

SAP has the greatest opportunity to succeed or fail with S/4 HANA. The success or failure will be based around the ability to easily, simply, quickly and cheaply integrate into each company’s Digital Enterprise Ecosystem. S/4 HANA’s core value today is about running your transactions (OLTP) on an analytics (OLAP) system, enabling real-time analytics on the data in your SAP system. The real-time analytics capability is not enough to take the risk of going to S/4 HANA. SAP must go far beyond its comfort zone of building a great business system to building in an unparalleled ability to participate in the Digital Enterprise Ecosystem directly to and from S/4 HANA.

The value and the power of the Digital Enterprise Ecosystem cannot be ignored and are becoming well understood by companies today. The following quote is just one illustration.

In 1990, the top three automakers in Detroit had among them nominal revenues of $250 billion, a market capitalization of $36 billion, and 1.2 million employees. The top three companies in Silicon Valley in 2014 had nominal revenues of $247 billion, a market capitalization of over $1 trillion, and only 137,000 employees. From McKinsey – Competition at the digital edge: ‘Hyperscale’ businesses.
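The comparison lands harder when you put it on a per-employee basis. Here is a quick Python sketch of that arithmetic using only the figures in the quote. Two assumptions are mine: “over $1 trillion” is treated as exactly $1 trillion (so the market cap numbers are a lower bound), and the dollars are nominal, as the quote says, with no inflation adjustment.

```python
def per_employee(total_dollars: float, employees: int) -> float:
    """Return dollars per employee."""
    return total_dollars / employees


# Figures taken from the McKinsey quote; 1e12 is a lower bound for "over $1 trillion".
detroit_1990 = {"revenue": 250e9, "market_cap": 36e9, "employees": 1_200_000}
valley_2014 = {"revenue": 247e9, "market_cap": 1e12, "employees": 137_000}

for label, group in (("Detroit 1990", detroit_1990), ("Silicon Valley 2014", valley_2014)):
    revenue = per_employee(group["revenue"], group["employees"])
    market_cap = per_employee(group["market_cap"], group["employees"])
    print(f"{label}: ${revenue:,.0f} revenue and ${market_cap:,.0f} market cap per employee")

revenue_ratio = per_employee(valley_2014["revenue"], valley_2014["employees"]) / per_employee(
    detroit_1990["revenue"], detroit_1990["employees"]
)
market_cap_ratio = per_employee(valley_2014["market_cap"], valley_2014["employees"]) / per_employee(
    detroit_1990["market_cap"], detroit_1990["employees"]
)
print(f"Revenue per employee: about {revenue_ratio:.1f}x")
print(f"Market cap per employee: at least about {market_cap_ratio:.0f}x")
```

On those figures, the Silicon Valley companies generate roughly 8.7 times the revenue per employee and at least about 243 times the market capitalization per employee.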

SAP cannot afford to play favorites in the Digital Enterprise Ecosystem. SAP claims openness, and in many cases they do adopt generic open standards, but there has always been a better, easier, simpler, faster and far cheaper way to integrate with other SAP products. In today’s API-driven world, this is unacceptable. Being the hub of the Digital Enterprise Ecosystem means simple, easy, native, real-time, API-driven integration with every major SaaS system.

SAP needs to start with the most recognized SaaS solution on the planet, Salesforce. Next they need to hook up Workday, the dreaded rival to SuccessFactors, and then they should just keep going down the list based on what SaaS solutions SAP clients are using, with no prejudice. SAP execs should go ask their clients for their wish list and put it in S/4 HANA immediately. Certainly they should enable their own excellent SaaS products like SuccessFactors. While this strategy may be non-intuitive in the pre-ecosystem world, it is THE requirement in the new one.

What happens if SAP does not open S/4 HANA to the Digital Enterprise Ecosystem? S/4 HANA will succeed as a modest upgrade. To go further, companies will have to find overwhelming new, required processes in S/4 HANA (Simple Finance, Simple Logistics, …). At this point, every process is being reevaluated and potentially rethought. The most likely scenario is migrating that process to a more agile SaaS solution. SaaS enables participating in the Digital Enterprise Ecosystem and adapts to the inevitable changes in business and technology. Adapting to changes like mobile, new currencies, new laws, the rise of texting, IaaS, etc. produces more gain than the highly optimized processes of the more rigid on-premise solutions like SAP ERP.

Simply put, if SAP S/4 HANA doesn’t become 100% open to all SaaS players, companies will continue to tear apart the monolithic ERP system and push its functions out to much more nimble SaaS solutions. The existing ERP systems will be put on life-support, made highly stable, and programmed around until they become useless and atrophied (see McKinsey’s A two-speed IT architecture for the digital enterprise).

SAP is in a great spot as companies are recognizing the power of the Digital Enterprise Ecosystem. It is my sincere wish that SAP recognizes they need to be the premier player in that Ecosystem and become the hub of their clients’ Digital Enterprise. After 22+ years working with SAP, I know they have the talent and I know the companies that have believed in SAP deserve it.