MicroStrategy integrated with Hortonworks Data Platform (HDP) running on IBM Power Systems

[ad_1]

Steps for discovering and visualizing data in HDP on IBM Power Systems using MicroStrategy


Comments

MicroStrategy provides a business intelligence (BI) solution called MicroStrategy Desktop.
MicroStrategy Desktop provides self-service analytics with built-in tools and dashboards
that enable data exploration and visualization. MicroStrategy supports accessing data in
Hadoop environments. Validation testing was performed to verify MicroStrategy’s ability
to integrate with and visualize data specifically to Hortonworks Data Platform (HDP) on
IBM® POWER8® processor-based servers. This article provides an overview of the
validation tests that were completed.

Objectives

The key objectives for the validation testing of MicroStrategy were to:

  • Configure MicroStrategy to connect to HDP 2.6 running on an IBM POWER8 processor-based
    server.
  • Extract and visualize sample data from the Hadoop Distributed File System (HDFS) of HDP
    running on a POWER8 processor-based server.

Test environment

This section lists the high-level components of MicroStrategy and HDP used in the test
environment.

MicroStrategy

  • MicroStrategy Personal Edition 10.6.0
  • Hortonworks ODBC Driver for Apache Hive v2.1.5
  • A notebook running Microsoft Windows 7

Hortonworks Data Platform

  • HDP version 2.6
  • Red Hat Enterprise Linux version 7.2
  • Minimum resources: Eight virtual processors, 24 GB memory, 50 GB disk space
  • IBM PowerKVM™
  • IBM POWER8 processor-based server

Deployment architecture

The deployment architecture is quite simple. MicroStrategy and the Hortonworks ODBC driver
were installed and run on a Windows 7 system. HDP was installed and run on a POWER8 server.
MicroStrategy and the ODBC driver were configured to connect to HDP. Data in HDP was
accessed and visualized by MicroStrategy Desktop. Tests were run in a single-node HDP
environment and a multi-node HDP cluster.

Installation and configuration

This section covers the installation and configuration of a HDP cluster and MicroStrategy software.

Installing and configuring the HDP cluster

Here are the high-level steps to install and configure the HDP cluster:

  1. Follow the installation guide for HDP on Power Systems (see Resources) to install and configure the HDP cluster.
  2. Log in to the Ambari server and ensure that all the services are running.
  3. Monitor and manage the HDP cluster, Hadoop, and related services through Ambari.

Setting up test data and Hive tables

Download the MovieLens and driver test data, copy the data to HDFS, and create Hive
tables.

  1. Download the MovieLens data set from here (see the
    citation in Resources)
  2. Follow the instructions here to copy the
    MovieLens dataset data to HDFS and set up Hive external tables. Use hive user ID
    for the same.
  3. Download the driver data file from the Driver Behavior data file from here.
  4. Copy the driver data to HDFS.

     

    # su – hive
    # hadoop fs -mkdir -p /user/hive/dataset/drivers
    # hadoop fs -copyFromLocal /home/np/u0014213/Data/truck_event_text_partition.csv /user/hive/dataset/drivers
    # hadoop fs -copyFromLocal /home/np/u0014213/Data/drivers.csv /user/hive/dataset/drivers
    # hadoop fs -ls /user/hive/dataset/drivers
    Found 2 items
    -rw-r--r--  3 hive hdfs     2043 2017-05-21 06:30 /user/hive/dataset/drivers/drivers.csv
    -rw-r--r--  3 hive hdfs  2272077 2017-05-21 06:30 /user/hive/dataset/drivers/truck_event_text_partition.csv

  5. Create Hive tables for driver data.

     

    # su – hive
    # hive
    hive>create database trucks;
    hive> use trucks;
     
    hive> create table drivers
    (driverId int,
    name string,
    ssn bigint,
    location string,
    certified string,
    wageplan string)
    ROW FORMAT DELIMITED
    FIELDS TERMINATED BY ','
    STORED AS TEXTFILE
    TBLPROPERTIES("skip.header.line.count"="1");
     
    hive> create table truck_events
    (driverId int,
    truckId int,
    eventTime string,
    eventType string,
    longitude double,
    latitude double,
    eventKey string,
    correlationId bigint,
    driverName string,
    routeId int,
    routeName string)
    ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
    STORED AS TEXTFILE
    TBLPROPERTIES("skip.header.line.count"="1");
     
    hive> show tables;
    OK
    drivers
    truck_events

  6. Load the data into the tables from the files in HDFS.

     

    hive> LOAD DATA INPATH '/user/hive/dataset/drivers/truck_event_text_partition.csv' overwrite into
      table truck_events;
    hive> LOAD DATA INPATH '/user/hive/dataset/drivers/drivers.csv' overwrite into table drivers;

  7. Cross check the tables to ensure that the data is present by running queries on the
    tables.

Installing and configuring Hortonworks ODBC driver

Here are the steps to install and configure the ODBC driver:

  1. Download the Hortonworks ODBC driver on Windows 7 (see Resources for the download website).
  2. Install and configure the ODBC driver. Follow the instructions in the guide listed in
    the Resources section.

Installing and configuring MicroStrategy

Here are the steps to install and configure MicroStrategy:

  1. Go to the MicroStrategy download page (see Resources) to
    download Microstrategy Desktop on Windows 7.
  2. Follow the prompts to install it in the Windows 7 system.

Connecting HDP to MicroStrategy

Here are the steps to configure the connection between HDP and MicroStrategy.

  1. Launch the ODBC Administrator from Windows and add a data source for Hortonworks Hive as
    shown in Figure 1.

    Figure 1. Hortonworks Hive ODBC Driver setup

  2. In the Windows 7 system, launch MicroStrategy Desktop and configure the connection to
    HDP as shown in Figure 2.

    Figure 2. Main MicroStrategy Desktop

  3. Select the database and tables as shown in Figure 3.
    Figure 3. Selecting databases and tables

  4. Add the data sources, provide the Hive data source details, and enter the Hive database
    password as shown in Figure 4.

    Figure 4. Specifying the necessary values

  5. Select the Hive data bases and tables and load the data so the data is ingested into
    MicroStrategy from Hive as shown in Figure 5. The data is now loaded into MicroStrategy as
    displayed in Figure 6. Now you are ready to start analyzing.

    Figure 5. Loading the Hive data into MicroStrategy

    Figure 6. Preview of the data loaded into MicroStrategy

Visualization and analysis in MicroStrategy

Using MicroStrategy Desktop, select the columns of data for visualization and analysis.
Note that the data is now in memory and analysis is done on the data in memory.

Figures 7-9 show examples of analysis and visualization that were tested using truck_events
data from the Driver Behavior database.

Figure 7. MicroStrategy truck_events visualization example 1

Figure 8. MicroStrategy truck_events visualization example 2

Figure 9. MicroStrategy truck_events visualization example 3

Figures 10-12 show examples of analysis and visualization that were tested using movie
rating data from the MovieLens dataset.

Figure 10. MicroStrategy movie rating visualization example 1

Figure 11. MicroStrategy movie rating visualization example 2

Figure 12. MicroStrategy movie rating visualization example 4

Resources

Downloadable resources

static.content.url=http://www.ibm.com/developerworks/js/artrating/

SITE_ID=1

Zone=Linux

ArticleID=1055099

ArticleTitle=MicroStrategy integrated with Hortonworks Data Platform (HDP) running on IBM Power Systems

publish-date=12082017

[ad_2]

IBM Maintenance

Leave a Reply

Your email address will not be published.