Untitled Document
  Home
Speakers
Sessions
  Schedule
Sponsors
Exhibitors
Untitled Document
2010 Europe Platinum Plus Sponsor

Untitled Document
2010 Europe Platinum Sponsor

Untitled Document
2010 Europe Gold Sponsor

Untitled Document
2010 Europe Bronze Sponsor

Untitled Document
2010 Europe Exhibitor

Untitled Document
2010 Europe Wireless Sponsor

Untitled Document
2009 East & West Platinum Sponsors

2009 East & West
Gold Sponsors

2009 East & West
Silver Sponsors

2009 East & West
Bronze Sponsors

2009 East & West
Exhibitors



Data Unification at Scale | @CloudExpo #BigData #DataLake #AI #Analytics
This term Data Unification is new in the Big Data lexicon, pushed by varieties of companies

This term Data Unification is new in the Big Data lexicon, pushed by varieties of companies such as Talend, 1010Data, and TamR. Data unification deals with the domain known as ETL (Extraction, Transformation, Loading), initiated during the 1990s when Data Warehousing was gaining relevance. ETL refers to the process of extracting data from inside or outside sources (multiple applications typically developed and supported by different vendors or hosted on separate hardware), transform it to fit operational needs (based on business rules), and load it into end target databases, more specifically, an operational data store, data mart, or a data warehouse. These are read-only databases for analytics. Initially the analytics was mostly retroactive (e.g. how many shoppers between age 25-35 bought this item between May and July?). This was like driving a car looking at the rear-view mirror. Then forward-looking analysis (called data mining) started to appear. Now business also demands "predictive analytics" and "streaming analytics".

During my IBM and Oracle days, the ETL in the first phase was left for outside companies to address. This was unglamorous work and key vendors were not that interested to solve this. This gave rise to many new players such as Informatica, Datastage, Talend and it became quite a thriving business. We also see many open-source ETL companies.

The ETL methodology consisted of: constructing a global schema in advance, for each local data source write a program to understand the source and map to the global schema, then write a script to transform, clean (homonym and synonym issues) and dedup (get rid of duplicates) it. Programs were set up to build the ETL pipeline. This process has matured over 20 years and is used today for data unification problems. The term MDM (Master Data Management) points to a master representation of all enterprise objects, to which everybody agrees to confirm.

In the world of Big Data, this approach is very inadequate. Why?

  • Data unification at scale is a very big deal. The schema-first approach works fine with retail data (sales transactions, not many data sources,..), but gets extremely hard with sources that can be hundreds or even thousands. This gets worse when you want to unify public data from the web with enterprise data.
  • Human labor to map each source to a master schema gets to be costly and excessive. Here machine learning is required and domain experts should be asked to augment where needed.
  • Real-time data unification of streaming data and analysis can not be handled by these solutions.

Another solution called "data lake" where you store disparate data in their native format, seems to address the "ingest" problem only. It tries to change the order of ETL to ELT (first load then transform). However it does not address the scale issues. The new world needs bottoms-up data unification (schema-last) in real-time or near real-time.

The typical data unification cycle can go like this - start with a few sources, try enriching the data with say X, see if it works, if you fail then loop back and try again. Use enrichment to improve and do everything automatically using machine learning and statistics. But iterate furiously. Ask for help when needed from domain experts. Otherwise the current approach of ETL or ELT can get very expensive.

  • LikeData Unification at scale
  • Comment
  • ShareShare Data Unification at scale



Read the original blog entry...

About Jnan Dash
Jnan Dash is Senior Advisor at EZShield Inc., Advisor at ScaleDB and Board Member at Compassites Software Solutions. He has lived in Silicon Valley since 1979. Formerly he was the Chief Strategy Officer (Consulting) at Curl Inc., before which he spent ten years at Oracle Corporation and was the Group Vice President, Systems Architecture and Technology till 2002. He was responsible for setting Oracle's core database and application server product directions and interacted with customers worldwide in translating future needs to product plans. Before that he spent 16 years at IBM. He blogs at http://jnandash.ulitzer.com.

Articles & Feature Stories
With more than 30 Kubernetes solutions in the marketplace, it's tempting to think Kubernetes and the vendor ecosystem has solved the problem of operationalizing containers at scale or of automatically managing the elasticity of the underlying infrastructure that these solutions need to be truly scalable. Far from it. There are at least six major pain points that companies experience when they try to deploy and run Kubernetes in their complex environments. In this presentation, the speaker will detail these pain points and explain how cloud can address them.
The deluge of IoT sensor data collected from connected devices and the powerful AI required to make that data actionable are giving rise to a hybrid ecosystem in which cloud, on-prem and edge processes become interweaved. Attendees will learn how emerging composable infrastructure solutions deliver the adaptive architecture needed to manage this new data reality. Machine learning algorithms can better anticipate data storms and automate resources to support surges, including fully scalable GPU-centric compute for the most data-intensive applications. Hyperconverged systems already in place can be revitalized with vendor-agnostic, PCIe-deployed, disaggregated approach to composable, maximizing the value of previous investments.
When building large, cloud-based applications that operate at a high scale, it's important to maintain a high availability and resilience to failures. In order to do that, you must be tolerant of failures, even in light of failures in other areas of your application. "Fly two mistakes high" is an old adage in the radio control airplane hobby. It means, fly high enough so that if you make a mistake, you can continue flying with room to still make mistakes. In his session at 18th Cloud Expo, Lee Atchison, Principal Cloud Architect and Advocate at New Relic, discussed how this same philosophy can be applied to highly scaled applications, and can dramatically increase your resilience to failure.
Machine learning has taken residence at our cities' cores and now we can finally have "smart cities." Cities are a collection of buildings made to provide the structure and safety necessary for people to function, create and survive. Buildings are a pool of ever-changing performance data from large automated systems such as heating and cooling to the people that live and work within them. Through machine learning, buildings can optimize performance, reduce costs, and improve occupant comfort by sharing information within the building and with outside city infrastructure via real time shared cloud capabilities.
As Cybric's Chief Technology Officer, Mike D. Kail is responsible for the strategic vision and technical direction of the platform. Prior to founding Cybric, Mike was Yahoo's CIO and SVP of Infrastructure, where he led the IT and Data Center functions for the company. He has more than 24 years of IT Operations experience with a focus on highly-scalable architectures.
CI/CD is conceptually straightforward, yet often technically intricate to implement since it requires time and opportunities to develop intimate understanding on not only DevOps processes and operations, but likely product integrations with multiple platforms. This session intends to bridge the gap by offering an intense learning experience while witnessing the processes and operations to build from zero to a simple, yet functional CI/CD pipeline integrated with Jenkins, Github, Docker and Azure.
The explosion of new web/cloud/IoT-based applications and the data they generate are transforming our world right before our eyes. In this rush to adopt these new technologies, organizations are often ignoring fundamental questions concerning who owns the data and failing to ask for permission to conduct invasive surveillance of their customers. Organizations that are not transparent about how their systems gather data telemetry without offering shared data ownership risk product rejection, regulatory scrutiny and increasing consumer lack of trust in technology in general.
René Bostic is the Technical VP of the IBM Cloud Unit in North America. Enjoying her career with IBM during the modern millennial technological era, she is an expert in cloud computing, DevOps and emerging cloud technologies such as Blockchain. Her strengths and core competencies include a proven record of accomplishments in consensus building at all levels to assess, plan, and implement enterprise and cloud computing solutions. René is a member of the Society of Women Engineers (SWE) and a member of the Society of Information Management (SIM) Atlanta Chapter. She received a Business and Economics degree with a minor in Computer Science from St. Andrews Presbyterian University (Laurinburg, North Carolina). She resides in metro-Atlanta (Georgia).
Enterprises are striving to become digital businesses for differentiated innovation and customer-centricity. Traditionally, they focused on digitizing processes and paper workflow. To be a disruptor and compete against new players, they need to gain insight into business data and innovate at scale. Cloud and cognitive technologies can help them leverage hidden data in SAP/ERP systems to fuel their businesses to accelerate digital transformation success.
Containers and Kubernetes allow for code portability across on-premise VMs, bare metal, or multiple cloud provider environments. Yet, despite this portability promise, developers may include configuration and application definitions that constrain or even eliminate application portability. In this session we'll describe best practices for "configuration as code" in a Kubernetes environment. We will demonstrate how a properly constructed containerized app can be deployed to both Amazon and Azure using the Kublr platform, and how Kubernetes objects, such as persistent volumes, ingress rules, and services, can be used to abstract from the infrastructure.
Untitled Document
Cloud Expo - Cloud Looms Large on SYS-CON.TV



Cloud Expo 2009 Europe Opening Keynote by GoodData

In this presentation Roman Stanek, a technology visionary who has spent the past fifteen years building world-class technology companies, will talk about what it means to be 'born on the cloud.' Specifically Roman will share with delegates his thoughts on how to use cloud computing as a technical design center; how to take advantage of the economics of cloud computing in building and operating cloud services; how to dramatically change customer adoption; and how to plan for the technical and operational scale that cloud computing makes possible.

Good Data - Collaborative Analytics On Demand
This presentation will describe how Good Data moved from concept to reality, demonstrating the importance of architecting for the cloud: how to conceptualize, design and build applications when cloud computing is a given.

VMware - Building Cloud Infrastructures with VMware vSphere
vSphere 4 is the industry’s first cloud operating system, transforming datacenters into dramatically simplified environments to enable the next generation of flexible, reliable IT services. Combining VMware’s industry leading virtualization technology and experience, VMware vSphere delivers uncompromising control, with greater efficiency, while preserving customer choice.

Sun Microsystems - The Sun Cloud: Sun's Public Cloud Computing Service
Cloud Computing is empowering users like never before, giving them access to massive amounts of compute power and storage capacity, on- demand and in real time. This session will outline Sun's vision and strategy for cloud computing, highlighting the Sun Cloud - Sun's Public Cloud Service that leverages a broad range of Sun's innovative hardware and software technology and products. An overview presentation and practical demonstrations / demos will showcase how cloud-based compute and storage resources enable users to deploy applications quickly, easily and inexpensively..

The Time is Right for Enterprise Cloud Computing
During his keynote, Rich Marcello, Senior Vice President of Unisys, will discuss the latest technologies and approaches that help knock down these barriers, creating the opportunity for attendees to now consider cloud managed services as part of their data center journey to secure "IT as a Service".

Accelerating Innovation with Cloud Computing
Join Shelton Shugar, Senior Vice President of Cloud Computing at Yahoo! for a keynote elaborating on how Yahoo! and consumers benefit from Yahoo! Cloud Services and will describe Yahoo! Cloud Services and technologies.

CloudEXPO Stories
With more than 30 Kubernetes solutions in the marketplace, it's tempting to think Kubernetes and the vendor ecosystem has solved the problem of operationalizing containers at scale or of automatically managing the elasticity of the underlying infrastructure that these solutions need to be truly scalable. Far from it. There are at least six major pain points that companies experience when they try to deploy and run Kubernetes in their complex environments. In this presentation, the speaker will detail these pain points and explain how cloud can address them.
The deluge of IoT sensor data collected from connected devices and the powerful AI required to make that data actionable are giving rise to a hybrid ecosystem in which cloud, on-prem and edge processes become interweaved. Attendees will learn how emerging composable infrastructure solutions deliver the adaptive architecture needed to manage this new data reality. Machine learning algorithms can better anticipate data storms and automate resources to support surges, including fully scalable GPU-centric compute for the most data-intensive applications. Hyperconverged systems already in place can be revitalized with vendor-agnostic, PCIe-deployed, disaggregated approach to composable, maximizing the value of previous investments.
When building large, cloud-based applications that operate at a high scale, it's important to maintain a high availability and resilience to failures. In order to do that, you must be tolerant of failures, even in light of failures in other areas of your application. "Fly two mistakes high" is an old adage in the radio control airplane hobby. It means, fly high enough so that if you make a mistake, you can continue flying with room to still make mistakes. In his session at 18th Cloud Expo, Lee Atchison, Principal Cloud Architect and Advocate at New Relic, discussed how this same philosophy can be applied to highly scaled applications, and can dramatically increase your resilience to failure.
Machine learning has taken residence at our cities' cores and now we can finally have "smart cities." Cities are a collection of buildings made to provide the structure and safety necessary for people to function, create and survive. Buildings are a pool of ever-changing performance data from large automated systems such as heating and cooling to the people that live and work within them. Through machine learning, buildings can optimize performance, reduce costs, and improve occupant comfort by sharing information within the building and with outside city infrastructure via real time shared cloud capabilities.
As Cybric's Chief Technology Officer, Mike D. Kail is responsible for the strategic vision and technical direction of the platform. Prior to founding Cybric, Mike was Yahoo's CIO and SVP of Infrastructure, where he led the IT and Data Center functions for the company. He has more than 24 years of IT Operations experience with a focus on highly-scalable architectures.
Top Stories for Cloud Expo Europe

Kevin L Jackson launched the "Government Cloud Computing Journal" on Ulitzer. The online magazine offers stories and articles on the effective use of cloud computing technologies within the government domain. Kevin L. Jackson is a senior information technologist specializing in information technology solutions that meet critical Federal government operational requirements. Currently, he serves as Director, Business Development for Dataline, Inc., and editor of Government Cloud Computing e-zine. Kevin L. Jackson (right) with Cloud Computing Expo conference chair Jeremy Geelan before his presentation on Government Cloud Computing. About Ulitzer.com Initiating content coverage on any topic or launching a magazine at Ulitzer.com  is designed to be as easy as boiling an egg and doesn't take much longer. To become a Ulitzer author, anyone can fill out a simple author prof... (more)

Best Recent Articles on Cloud Computing & Big Data Topics
As we enter a new year, it is time to look back over the past year and resolve to improve upon it. In 2014, we will see more service providers resolve to add more personalization in enterprise technology. Below are seven predictions about what will drive this trend toward personalization.
IT organizations face a growing demand for faster innovation and new applications to support emerging opportunities in social, mobile, growth markets, Big Data analytics, mergers and acquisitions, strategic partnerships, and more. This is great news because it shows that IT continues to be a key stakeholder in delivering business service innovation. However, it also means that IT must deliver new innovation despite flat budgets, while maintaining existing services that grow more complex every day.
Cloud computing is transforming the way businesses think about and leverage technology. As a result, the general understanding of cloud computing has come a long way in a short time. However, there are still many misconceptions about what cloud computing is and what it can do for businesses that adopt this game-changing computing model. In this exclusive Q&A with Cloud Expo Conference Chair Jeremy Geelan, Rex Wang, Vice President of Product Marketing at Oracle, discusses and dispels some of the common myths about cloud computing that still exist today.
Despite the economy, cloud computing is doing well. Gartner estimates the cloud market will double by 2016 to $206 billion. The time for dabbling in the cloud is over! The 14th International Cloud Expo, co-located with 5th International Big Data Expo and 3rd International SDN Expo, to be held June 10-12, 2014, at the Javits Center in New York City, N.Y. announces that its Call for Papers is now open. Topics include all aspects of providing or using massively scalable IT-related capabilities as a service using Internet technologies (see suggested topics below). Cloud computing helps IT cut infrastructure costs while adding new features and services to grow core businesses. Clouds can help grow margins as costs are cut back but service offerings are expanded. Help plant your flag in the fast-expanding business opportunity that is The Cloud, Big Data and Software-Defined Networking: submit your speaking proposal today!
What do you get when you combine Big Data technologies….like Pig and Hive? A flying pig? No, you get a “Logical Data Warehouse.” In 2012, Infochimps (now CSC) leveraged its early use of stream processing, NoSQLs, and Hadoop to create a design pattern which combined real-time, ad-hoc, and batch analytics. This concept of combining the best-in-breed Big Data technologies will continue to advance across the industry until the entire legacy (and proprietary) data infrastructure stack will be replaced with a new (and open) one.
While unprecedented technological advances have been made in healthcare in areas such as genomics, digital imaging and Health Information Systems, access to this information has been not been easy for both the healthcare provider and the patient themselves. Regulatory compliance and controls, information lock-in in proprietary Electronic Health Record systems and security concerns have made it difficult to share data across health care providers.
Cloud Expo, Inc. has announced today that Vanessa Alvarez has been named conference chair of Cloud Expo® 2014. 14th International Cloud Expo will take place on June 10-12, 2014, at the Javits Center in New York City, New York, and 15th International Cloud Expo® will take place on November 4-6, 2014, at the Santa Clara Convention Center in Santa Clara, CA.
12th International Cloud Expo, held on June 10–13, 2013 at the Javits Center in New York City, featured four content-packed days with a rich array of sessions about the business and technical value of cloud computing led by exceptional speakers from every sector of the cloud computing ecosystem. The Cloud Expo series is the fastest-growing Enterprise IT event in the past 10 years, devoted to every aspect of delivering massively scalable enterprise IT as a service.
Ulitzer.com announced "the World's 30 most influential Cloud bloggers," who collectively generated more than 24 million Ulitzer page views. Ulitzer's annual "most influential Cloud bloggers" list was announced at Cloud Expo, which drew more delegates than all other Cloud-related events put together worldwide. "The world's 50 most influential Cloud bloggers 2010" list will be announced at the Cloud Expo 2010 East, which will take place April 19-21, 2010, at the Jacob Javitz Convention Center, in New York City, with more than 5,000 expected to attend.
It's a simple fact that the better sales reps understand their prospects' intentions, preferences and pain points during calls, the more business they'll close. Each day, as your prospects interact with websites and social media platforms, their behavioral data profile is expanding. It's now possible to gain unprecedented insight into prospects' content preferences, product needs and budget. We hear a lot about how valuable Big Data is to sales and marketing teams. But data itself is only valuable when it's part of a bigger story, made visible in the right context.
Cloud Expo, Inc. has announced today that Larry Carvalho has been named Tech Chair of Cloud Expo® 2014. 14th International Cloud Expo will take place on June 10-12, 2014, at the Javits Center in New York City, New York, and 15th International Cloud Expo® will take place on November 4-6, 2014, at the Santa Clara Convention Center in Santa Clara, CA.
Everyone talks about a cloud-first or mobile-first strategy. It's the trend du jour, and for good reason as these innovative technologies have revolutionized an industry and made savvy companies a lot of money. But consider for a minute what's emerging with the Age of Context and the Internet of Things. Devices, interfaces, everyday objects are becoming endowed with computing smarts. This is creating an unprecedented focus on the Application Programming Interface (API) as developers seek to connect these devices and interfaces to create new supporting services and hybrids. I call this trend the move toward an API-first business model and strategy.
We live in a world that requires us to compete on our differential use of time and information, yet only a fraction of information workers today have access to the analytical capabilities they need to make better decisions. Now, with the advent of a new generation of embedded business intelligence (BI) platforms, cloud developers are disrupting the world of analytics. They are using these new BI platforms to inject more intelligence into the applications business people use every day. As a result, data-driven decision-making is finally on track to become the rule, not the exception.
Untitled Document
Register and Save Before June 20!
Only €295
for a Full “Golden Pass!”
Call 201.802.3020
or click here to Register
Early Bird Expires June 20th.


Prague Call For Papers NOW CLOSED
Submit
your speaking proposal for the
upcoming Cloud Expo West
in Santa Clara, CA
[November 1-4, 2010]


Sponsorship Opportunities
Please Call
201.802.3021
events (at) sys-con.com
SYS-CON's Cloud Expo, held each year in California, New York, Prague, Tokyo, and Hong Kong is the world’s leading Cloud event in its 4th year, larger than all other Cloud events put together. For sponsorship, exhibit opportunites and show prospectus, please contact Carmen Gonzalez, carmen (at) sys-con.com.

Cloud Expo TV from Times Square [Click To View]

Who Should Attend?
Senior Technologists including CIOs, CTOs, VPs of technology, IT directors and managers, network and storage managers, network engineers, enterprise architects, communications and networking specialists, directors of infrastructure Business Executives including CEOs, CMOs, CIOs, presidents, VPs, directors, business development; product and purchasing managers.


Download Cloud Computing Journal & Show Guide
Cloud Computing Journal
Download PDF
Cloud Expo Show Guide
Download PDF

The World's 30 Most influential Cloud Bloggers
Cloud Expo on Ulitzer
1
Dustin Amrhein 11 Kevin Hoffman 21 Greg O'Connor
2
Ezhil Babaraj 12 Alin Irimie 22 Maureen O'Gara
3
Tony Bishop 13 Kevin Jackson 23 Mark O'Neill
4
Reuven Cohen 14 Fuat Kircaali 24 Bill Roth
5
Ernest de Leon 15 David Linthicum 25 Ellen Rubin
6
David Dean 16 Lori MacVittie 26 John Savageau
7
Ray DePena 17 Bill McColl 27 Michael Sheehan
8
Dana Gardner 18 Paul Miller 28 Roman Stanek
9
John Gauntt 19 Louis Naugès 29 John Treadway
10
Jeremy Geelan 20 Greg Ness 30 Alan Williamson

Digital Transformation Blogs
As Cybric's Chief Technology Officer, Mike D. Kail is responsible for the strategic vision and technical direction of the platform. Prior to founding Cybric, Mike was Yahoo's CIO and SVP of Infrastructure, where he led the IT and Data Center functions for the company. He has more than 24 years of IT Operations experience with a focus on highly-scalable architectures.
The explosion of new web/cloud/IoT-based applications and the data they generate are transforming our world right before our eyes. In this rush to adopt these new technologies, organizations are often ignoring fundamental questions concerning who owns the data and failing to ask for permission to conduct invasive surveillance of their customers. Organizations that are not transparent about how their systems gather data telemetry without offering shared data ownership risk product rejection, regulatory scrutiny and increasing consumer lack of trust in technology in general.
René Bostic is the Technical VP of the IBM Cloud Unit in North America. Enjoying her career with IBM during the modern millennial technological era, she is an expert in cloud computing, DevOps and emerging cloud technologies such as Blockchain. Her strengths and core competencies include a proven record of accomplishments in consensus building at all levels to assess, plan, and implement enterprise and cloud computing solutions. René is a member of the Society of Women Engineers (SWE) and a member of the Society of Information Management (SIM) Atlanta Chapter. She received a Business and Ec...
Untitled Document
Past SYS-CON Events
    Cloud Expo East
cloudcomputingexpo
2010east.sys-con.com

 
    Virtualization Expo East
virtualizationconference
2010east.sys-con.com
    Cloud Expo West
cloudcomputingexpo
2009west.sys-con.com

 
    Virtualization Expo West
virtualizationconference
2009west.sys-con.com
    GovIT Expo
govitexpo.com
 
    Cloud Expo Europe
cloudexpoeurope2009.sys-con.com
 

Cloud Expo 2010 Allstar Conference Faculty

SARWAL
Oracle

COFFEE
Salesforce

KHAN
Sybase

BISHOP
Adaptivity

MALCOLM
Abiquo

KHALIDI
Microsoft

RILEY
AWS

AZUA
IBM

BARRETO
Intel

CHAKRAVARTY
Novell

CRANDELL
RightScale

GAUVIN
Virtual Ark

GROSS
Unisys

SCHALK
Google

YEN
Juniper Networks

WILLOUGHBY
Compuware

What The Enterprise IT World Says About Cloud Expo
 
"We had extremely positive feedback from both customers and prospects that attended the show and saw live demos of NaviSite's enterprise cloud based services."
  –William Toll
Sr. Director, Marketing & Strategic Alliances
Navisite
 


 
"More and better leads than ever expected! I have 4-6 follow ups personally."
  –Richard Wellner
Chief Scientist
Univa UD
 


 
"Good crowd, good questions. The event looked very successful."
  –Simon Crosby
CTO
Citrix Systems
 


 
"Great conference and group of speakers, interesting timely announcements, and awesome networking."
  –Ricardo Sanchez
Software Architect
Myriadtech