BMMSoft UDAS: The World's First Unified Big Data Solution

BMMsoft EDMT® Server FAQ – Frequently Asked Questions and Background Information

EDMT® Solution FAQ - Frequently Asked Questions

1. What does EDMT® stand for?

EDMT® stands for “Emails, Documents, Multimedia and Transactions”. It covers any data type that can be stored in electronic format. EDMT® is a trademark of BMMsoft, Inc.

2. What is EDMT® Solution?

EDMT® Solution is a state-of-the-art HW+SW solution for storing and analyzing large volumes (hundreds of TB to multiple PB) of EDMT data in real time. It offers a unique combination of large-scale Text and SQL analysis with virtually unlimited storage for source files, emails, SMS/MMS (“BLOBs”) and DB records (transactions).

3. What is “Big Data” and how does EDMT® Solution play in that space?

“Big Data” is a term for enormous amounts of structured and unstructured data (possibly tens to hundreds of TB, petabytes, exabytes or more) that have to be stored and analyzed at an affordable price, using any search or analytic methodology. EDMT® Solution is the pragmatic answer to that problem.

4. How is EDMT® Solution packaged and delivered?

EDMT® Solution comes in 7 models, ranging from a high-end system with 15,360 cores and 40 PB to an entry system with 4 cores and 36 TB. The preferred, and only certified, HW platform is HP DL980/Linux servers or HP Integrity servers with HP P2000 storage systems.


5. How long does deployment of EDMT® Solution take, and does it require changes to existing DW and OLTP systems?

Standard deployment of EDMT® Solution takes 1 week using “EDMT Deployment Services”. BMMsoft or an SI (with over 1,000 consultants worldwide) performs the on-site deployment and connects EDMT® Solution to standard data sources, as agreed in the SOW defined in Activity #9 of the “BCS EDMT Sales process”. Contrast this with vendors who offer multiple, non-integrated “point products” (typically the result of multiple acquisitions) and need months to build a custom integration linking those products, at great expense and with the never-ending burden of maintaining a non-standard, custom product.
Some customers deploy EDMT® Solution side-by-side with their DW (e.g. Oracle, Netezza, IQ) or OLTP systems (e.g. Oracle, MS SQL Server, Sybase ASE) to extend those systems with unstructured data and give them the “Big Picture” of “Big Data” WITHOUT modifying anything in the existing systems. Nobody would want to “flood” an existing DW (e.g. 5 TB) with hundreds or thousands of TB of emails, SMS, files and multimedia; adding EDMT as a side-extender is much simpler, faster and carries NO RISK.

6. What is the difference between EDMT® Solution and a data warehouse (DW)?

A DW stores structured data (approx. 20% of all corporate data), while EDMT® Solution stores all data types (“EDMT”), which is 100% of corporate data. EDMT® Solution enables the storage and analysis of all data types so that business users can quickly and easily access and cross-analyze all data.

7. Does EDMT® Solution operate in Real-time and at high speed?

Yes. EDMT® Solution operates in real time: stored data is available for access and analysis 0.5-2 seconds after it has been submitted to the EDMT® Ingestor. The top speed achieved with EDMT® Solution is 14 TB/hour, on an HP DL980.
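To put the 14 TB/hour figure in more familiar terms, the short sketch below converts it to a sustained rate in GB per second and estimates how long a given volume would take to load at that rate. It assumes decimal units (1 TB = 1,000 GB), which the FAQ does not specify, and the function names are purely illustrative.

```python
# Assumption: decimal storage units (1 TB = 1,000 GB); the FAQ does not say
# whether it means decimal or binary terabytes.
GB_PER_TB = 1000
SECONDS_PER_HOUR = 3600


def tb_per_hour_to_gb_per_second(tb_per_hour: float) -> float:
    """Convert a load rate in TB/hour to a sustained rate in GB/second."""
    return tb_per_hour * GB_PER_TB / SECONDS_PER_HOUR


def hours_to_load(volume_tb: float, rate_tb_per_hour: float) -> float:
    """Hours needed to load `volume_tb` at a constant rate."""
    return volume_tb / rate_tb_per_hour


peak_gb_per_second = tb_per_hour_to_gb_per_second(14)  # roughly 3.9 GB/s
hours_for_140_tb = hours_to_load(140, 14)              # 10 hours at peak rate
```

The estimate assumes the peak rate is sustained for the whole load, which real workloads may not achieve.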

8. HW and SW for EDMT® Solution?

EDMT® Solution is certified on HP DL980/Linux high-end servers. EDMT® Solution uses Sybase IQ as a DB running on HP DL980 because of its ability to perform SQL and Text analytics. Optionally, IQ can run on HP Integrity servers.

9. What is the support cost for EDMT® Solution?

Maintenance and Support for EDMT SW is 20% of the SW purchase price (common in the industry). HW support is under HP control.

10. Can BMMsoft SI sell non-HP HW into deals created by HP?

No. BMMsoft SIs are not allowed to sell or promote non-HP HW into opportunities created by HP. In addition, EDMT® Solution is certified only on HP HW, and BMMsoft generates license keys for EDMT SW and IQ that are specific to the HW in each deal, and only for HP HW. An HP BCS Sales Rep will also be involved in every step of the sales process.

11. How does EDMT® Solution compare to HP Vertica?

EDMT® Solution is an application that runs on top of any certified database. Vertica is an analytic database, so EDMT® Solution and Vertica are completely different products. EDMT® Solution stores structured and unstructured data in a DBMS to perform Text and SQL analysis and cross-analysis, and provides extremely cost-effective storage of source data (emails, files, multimedia and DB records). Vertica focuses on BI SQL analytics (and does an excellent job), but offers neither full-text search nor low-cost, multi-petabyte storage for source data (emails/SMS, files, multimedia). EDMT® Solution has been neither ported to nor certified with Vertica. The role of Hadoop is explained in question 14. EDMT® Solution does not replace Vertica; on the contrary, it can be deployed side-by-side to extend Vertica with large amounts of unstructured data (including BLOBs).

12. How does EDMT® Solution compare to Autonomy?

Autonomy is an excellent text search engine, very popular in enterprise content search and with a wide variety of text-search applications (not unlike the Google search engine). Autonomy does not offer large-scale SQL analytics, SQL-text cross-analytics or integrated storage of source data (e.g. emails, files, multimedia or SQL records), all of which are an integral part of EDMT® Solution. As such, EDMT® Solution can help Autonomy customers by extending Autonomy with an EDMT® “backend”, enhancing Autonomy’s SQL/text capability and BLOB storage capability, and enabling HP sales reps to sell more BCS HW (DL980, Integrity and P2000) and consulting.

13. How does EDMT® fit the “in-memory database” technology trend?

The EDMT backend database can transparently switch between running “in-memory” and “partially in-memory”. That allows EDMT to take full advantage of ever larger and more affordable RAM while NOT limiting the database size to the amount of available RAM (“in-memory” databases are limited to the size of RAM, or their performance degrades drastically as soon as the data size exceeds RAM size). Because EDMT® Solution targets “Big Data”, it maximizes performance by keeping the most frequently accessed indexes and metadata in RAM, while keeping less frequently accessed data (e.g. source files) on fast disks.
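The idea of keeping hot data in RAM while the full data set lives on disk can be sketched in a few lines. The class below is a hypothetical illustration only, not the EDMT or Sybase IQ implementation: it uses least-recently-used eviction as a simple stand-in for a “most frequently accessed” policy, and a plain dictionary stands in for the disk tier.

```python
from collections import OrderedDict


class TieredStore:
    """Hypothetical two-tier store: a bounded RAM tier in front of a 'disk' tier."""

    def __init__(self, ram_capacity: int):
        self.ram_capacity = ram_capacity
        self.ram = OrderedDict()  # hot tier, ordered from least to most recently used
        self.disk = {}            # stand-in for the disk tier; holds every item
        self.disk_reads = 0       # counts reads that missed the RAM tier

    def put(self, key, value):
        """Write to disk and make the item hot."""
        self.disk[key] = value
        self._promote(key, value)

    def get(self, key):
        """Serve from RAM when possible, otherwise read from disk and promote."""
        if key in self.ram:
            self.ram.move_to_end(key)
            return self.ram[key]
        self.disk_reads += 1
        value = self.disk[key]
        self._promote(key, value)
        return value

    def _promote(self, key, value):
        """Place the item in RAM and evict the least recently used overflow."""
        self.ram[key] = value
        self.ram.move_to_end(key)
        while len(self.ram) > self.ram_capacity:
            self.ram.popitem(last=False)


store = TieredStore(ram_capacity=2)
for name in ("a", "b", "c"):
    store.put(name, name.upper())
store.get("a")  # not in RAM any more: one disk read, then promoted
store.get("c")  # still in RAM: no disk read
```

The total data size (three items) exceeds the RAM tier (two items), yet every item stays retrievable; only the access cost differs.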

14. EDMT® and Hadoop?

“Apache Hadoop is a software framework that supports data-intensive distributed applications ... It enables applications to work with thousands of nodes and petabytes of data” [Wiki]. In essence, Hadoop distributes data across many single nodes, each running the Hadoop file system, and can activate and schedule some (external, non-Hadoop) data processing applications (e.g. DB, text-search engine, painter etc.). For users who want to invest time in developing their own application (while still having to purchase a 3rd-party DB), Hadoop is a good tool. For users who want an “out-of-the-box” system that can be deployed in 1 week without any programming, while having high-speed “3-Million-channel processing power”, EDMT® Solution is the way to go. Needless to say, EDMT® Solution can be customized, extended or reconfigured if needed using its own “3-Million-channel Ingestor”. Optionally, EDMT® Solution can include Hadoop (or any other distributed tool or connector) if needed.

15. Why is HP DL980 a good fit for EDMT?

In October 2011, EDMT on the DL980 became the world’s fastest “big data” loader: it loaded over 14 TB of data per hour, faster than a full rack of competing solutions! What enabled this world record is the power and size of the DL980 combined with EDMT’s optimization for and certification on the DL980. A single large server handles the unpredictable workload of big data capture, parsing, ingest, indexing and search much better than a large number of small(er) servers. If a large amount of data has to be parsed and ingested quickly, it needs to fit in RAM to minimize disk I/O. Without large RAM, the server has to resort to paging/swapping, which is a very slow process and immediately reduces overall system speed by order(s) of magnitude. That is why the large RAM size (up to 2 TB) and large number of cores (up to 80) in the DL980 are a perfect match for “big data”.

16. Sales process and its phases?

The EDMT® sales cycle has 5 phases. For more details about the EDMT sales process, see “HP BCS SERVER EDMT® Sales Process”.

Sequence

Activity

Phase 1
Screening

1. Together with your EDMT SWAT Team counterpart, identify potential prospects using the “5 hunting questions” for EDMT® Solution.
2. Together with your EDMT SWAT Team counterpart, contact the prospects, ask the “5 hunting questions” and show the “EDMT Customer Presentation” if needed. If the answer to any of the questions is “yes”, review and decide next steps together with your EDMT SWAT Team counterpart.
3. Together with your EDMT SWAT Team, have a call/meeting with your prospect.
4. After the joint conference call, a go/no-go decision will be made.

Phase 2
Discovery

5. In-depth consultation/calls (remote or in-person) will be held with HP, the EDMT SWAT Team and the prospect to discover precise requirements and needs.

Phase 3
Close

6. Prospect requests proposal
7. Technical and service details defined, EDMT Solution sized and defined
8. Price determined
9. SOW precisely defined for 1 week of Deployment Services
10. (Opt.) additional consulting services (outside the Deployment Services) defined
11. The customer issues 2 purchase orders: one for hardware and hardware services (HP BCS reps), the other for the complete EDMT Software and EDMT software services

Phase 4
Deployment

12. Hardware shipped to SI
13. SI configures hardware; BMMsoft installs software remotely and tests
14. Tested configuration shipped to customer
15. SI installs and provides OJT training for the customer
16. Customer attends formal training for self-administration or SI takes over support and administration
17. Check whether the customer is satisfied with the EDMT deployment delivered by the BMMsoft SI. This sets the stage for Phase 5

Phase 5
Upsell (+50% per year)

18. Jointly with the EDMT SWAT Team, keep upselling the customer to create a constant revenue stream, because
(a) “big data” will keep growing fast (= more storage and servers to ingest it) and
(b) new analytics will be added, and more users will require more servers.
Phase 5 will generate many times more revenue than the initial sale (Phases 1-4). The average growth of a properly managed EDMT installation is 50% per year.
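A growth rate of 50% per year compounds quickly. The sketch below shows how large an installation becomes, relative to its initial size, if that average rate holds every year; the constant-rate assumption and the function name are illustrative and not taken from the sales process document.

```python
def installation_size_multiple(years: int, annual_growth: float = 0.5) -> float:
    """Size of an installation after `years`, as a multiple of its initial size.

    Assumes the same growth rate applies every year (compound growth).
    """
    return (1 + annual_growth) ** years


# Relative size after each of the first five years at 50% growth per year.
multiples = [installation_size_multiple(year) for year in range(1, 6)]
```

Under this assumption an installation more than triples in three years and is over seven times its initial size after five.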

17. EDMT® Solution vs. (partial or “point”) competitors that handle unstructured or structured data?
(e.g. email archives, document management systems, file archives, SQL archives, eDiscovery front-end tools, text search engines etc.)

There are many “point products” on the market, each solving one or two specific tasks (e.g. email archive, eDiscovery front-end, file archive etc.). Those “point” products are the result of the technological evolution of HW and SW, as well as of data processing needs (this is a long story). They focus on one (and only one) data “space”: structured or unstructured. Those two spaces are almost 100% separated, with different processing techniques, products, standards, skill sets and mindsets, as well as vastly different price points. They “talk” to different segments of the company: structured products talk mostly to the business groups of the organization, while unstructured/text products talk to legal, archive, compliance and similar teams. EDMT® Solution takes a fresh and revolutionary approach to the problem of unified storage and analysis of structured and unstructured data, and has solved it in all aspects: we can store and analyze any data in almost any way, while meeting the strict price restrictions of both camps. We allow structured tools used in BI (e.g. Business Objects or MicroStrategy) to access EDMT® Solution, cross-correlate large volumes of “EDMT” data and view unstructured data (both previously impossible).

18. EDMT® vs. Google Search Appliance (“GSA”, sold by Dell)?

Google Search Appliance is “Google-in-the-box”: it performs Google-style operations (spider + index + search) on files on servers behind corporate firewalls.

  1. GSA is a text search engine only
  2. Very limited SQL storage and virtually no ability to do SQL analytics
  3. Does not store source files (“BLOBs”), so GSA cannot offer data retention, collaboration or any kind of data protection or data export functionality, because it can NOT guarantee that source data will not be deleted or modified by the time it is needed.
  4. To index 1 million files, a GSA customer pays over $70,000. In comparison, EDMT® Solution can index 1 million files/emails for under $100, or 300x better price/performance, while offering much higher capacity (typically 100x higher).
  5. Poor scalability of GSA: the largest GSA (GB 9009, 5RU) can index only 10 million files. In contrast, the smallest EDMT® Solution “M” can index 1.5 billion files (150x more) at a lower price. EDMT® Solution has more than 100x better price/performance than GSA
  6. Most other text-search engines are likely comparable to GSA
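The price and capacity ratios in the list above can be checked with simple arithmetic. The sketch below uses only the figures quoted in this answer, treating “over $70,000” and “under $100” as exact values for illustration; the constant and function names are made up for the example. Taken at face value, the two price bounds give a ratio of about 700x, which is higher than the 300x stated; the FAQ does not show the inputs behind its own figure.

```python
# Figures quoted in the FAQ, treated as exact values for illustration only.
GSA_USD_PER_MILLION_FILES = 70_000   # "over $70,000" to index 1 million files
EDMT_USD_PER_MILLION_FILES = 100     # "under $100" to index 1 million files
GSA_MAX_FILES = 10_000_000           # largest GSA model
EDMT_M_MAX_FILES = 1_500_000_000     # smallest EDMT model ("M")


def cost_per_file(usd_per_million_files: float) -> float:
    """Indexing cost of a single file in USD."""
    return usd_per_million_files / 1_000_000


price_ratio = GSA_USD_PER_MILLION_FILES / EDMT_USD_PER_MILLION_FILES
capacity_ratio = EDMT_M_MAX_FILES / GSA_MAX_FILES
```

The capacity ratio works out to 150, matching the “150x more” quoted in item 5.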

 

19. Why is IQ important for EDMT®?

Sybase IQ offers excellent mixed Text+SQL analysis at large scale (certified to over 1 PB). This type of workload is totally different from a typical BI analytic workload. In addition, IQ can store “original files” (e.g. office documents, PDFs, images, audio, video, emails, SMS/MMS etc.) in their “native format” and thus meets the critical need for “original files”. Using IQ to store original files brings many advantages, particularly when the number of files reaches billions or trillions (e.g. SMS, email), as file systems are designed to handle thousands or millions of files, not billions or trillions. IQ can compress files (using lossless compression) and put BLOBs on lower-cost storage to further reduce storage cost (similar to the cost of tapes and drives), while offering instant (sub-second) access and high-speed export/viewing of “source data”. IQ’s ability to use high-capacity, low-cost SATA disks (currently at 4 TB per spindle) keeps disk cost, storage footprint and electricity cost at a very low (tape-like) level.
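“Compression without data loss” means the stored bytes can be restored exactly. The sketch below illustrates that property with Python's standard zlib module; it is a generic illustration of lossless compression, not the compression scheme Sybase IQ actually uses, and the sample document is invented.

```python
import zlib


def compress_blob(data: bytes) -> bytes:
    """Losslessly compress a source file before storing it as a BLOB."""
    return zlib.compress(data, 9)  # 9 = highest standard zlib compression level


def restore_blob(stored: bytes) -> bytes:
    """Recover the exact original bytes from a compressed BLOB."""
    return zlib.decompress(stored)


# Invented sample: repetitive text, as found in many office documents and emails.
original = b"Quarterly report: revenue up, costs down. " * 200
stored = compress_blob(original)
restored = restore_blob(stored)
saving = 1 - len(stored) / len(original)  # fraction of space saved on this sample
```

The saving depends heavily on the content: repetitive text compresses well, while already-compressed formats such as JPEG or MP3 gain little.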

20. EDMT® Solution vs. HANA?

TBD

21. How to position EDMT® vs. Oracle/IBM/Netezza/EMC/Teradata/MS?

TBD

22. EDMT® Solution Customization?

Provided by authorized and approved BMMsoft Systems Integrator.

23. EDMT® Consulting?

Provided by BMMsoft and/or selected and assigned BMMsoft Systems Integrator to a specific project.

24. Who are EDMT® System Integrators?

The list of authorized EDMT® System Integrators has been provided to HP. For each customer, BMMsoft will select optimal SI – based on project details, geography and skill set – in order to meet customer demands.

25. What makes EDMT® Solution unique and scalable?

EDMT® Solution major technical differentiators include:

  1. Designed to manage information of all data types (not only structured or only unstructured data)
  2. Provides real-time alerting and notification of events referenced in any type of data
  3. High ingest speed (over 1 terabyte/hour), enabling real-time data capture and response
  4. Automated mapping of structured and unstructured content ("metadata on the fly")
  5. Stores unified data in a single SQL database, making unified search and analysis possible
  6. Extreme data compression reduces cost by 90%
  7. Energy efficient by design: fewer resources = less energy = less CO2

 

26. What are EDMT® Solution’s main components?

EDMT® Solution is a prepackaged, preinstalled and certified Hardware + Software solution (rather than a myriad of hardware and expensive customized programs). The main components of EDMT® Solution are:

  1. The EDMT® APIs enable 3rd party applications (SQL, Text or both) to ingest and access data in EDMT® Solution.
  2. The EDMT® Ingest and Meta data Manager automates all of the unstructured data ingest processes and transforms this content into a SQL-compatible structure by creating metadata tags as the data is ingested. Using parallelized ingest processes, it enables the ingest and transformation of unstructured data at speeds exceeding 1 terabyte per hour.
  3. The EDMT® Policy Manager automates the processes for managing all policies governing data/records classification, retention, access privileges and alerts.
  4. EDMT® Information Access, Discovery and Analysis Services are a set of user interface methods and services for directly interrogating and analyzing the combined data managed by EDMT® Server. These services include an EDMT® Solution Administrator GUI, an API for integrating standard SQL query tools, an API for integrating web services and an API for integrating custom data analysis tools.
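The ingest step described above, turning unstructured content into a SQL-compatible structure by extracting metadata, can be illustrated on a very small scale. The sketch below is not the EDMT implementation and does not use the EDMT APIs: it uses Python's standard email and sqlite3 modules, with SQLite standing in for the back-end database, and all table names, column names and sample data are invented. It parses one RFC 2822 message, stores its metadata alongside the original source as a BLOB, and then cross-analyzes the text against a structured table with a single SQL query.

```python
import sqlite3
from email import message_from_string
from email.utils import parseaddr

# Invented sample message in RFC 2822 format.
RAW_EMAIL = """\
From: Alice <alice@example.com>
To: bob@example.com
Subject: Invoice 1042 overdue
Date: Mon, 03 Oct 2011 09:15:00 +0000

Please settle invoice 1042 this week.
"""


def ingest_email(conn: sqlite3.Connection, raw: str) -> None:
    """Extract metadata from a raw email and store it with the original source."""
    msg = message_from_string(raw)
    sender = parseaddr(msg["From"])[1]  # address part only
    conn.execute(
        "INSERT INTO messages (sender, subject, sent, body, source) "
        "VALUES (?, ?, ?, ?, ?)",
        (sender, msg["Subject"], msg["Date"], msg.get_payload(),
         raw.encode("utf-8")),  # original kept unchanged as a BLOB
    )


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE messages (id INTEGER PRIMARY KEY, sender TEXT, subject TEXT, "
    "sent TEXT, body TEXT, source BLOB)"
)
# Structured (transactional) data living in the same database.
conn.execute(
    "CREATE TABLE invoices (invoice_no INTEGER PRIMARY KEY, "
    "customer_email TEXT, amount REAL)"
)
conn.execute("INSERT INTO invoices VALUES (1042, 'bob@example.com', 1250.0)")
ingest_email(conn, RAW_EMAIL)

# Cross-analysis: find emails whose text mentions a known invoice number.
rows = conn.execute(
    "SELECT i.invoice_no, i.amount, m.sender, m.subject "
    "FROM invoices i JOIN messages m ON m.body LIKE '%' || i.invoice_no || '%'"
).fetchall()
source = conn.execute("SELECT source FROM messages").fetchone()[0]
```

A plain `LIKE` match is used here only to keep the example self-contained; a production system would rely on a real full-text index.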

 

27. How cost-effective and “green” is the EDMT® Solution?

EDMT® Solution costs approximately 1/10th of alternative approaches to archiving, accessing and analyzing structured and unstructured data. It also reduces energy consumption and CO2 emissions by as much as 90%.

28. EDMT® Appliance has the following fully integrated components:

  1. Connectors to multiple email servers (MS Exchange, Lotus, Gmail, HotMail, Yahoo, any RFC 2822-compatible format) and the in-flight SMTP protocol
  2. Integrated file system spiders
  3. Real-time, multi-channel ETL for unstructured data (over 400 formats)
  4. Real-time ETL for structured data (custom)
  5. High-speed, massively parallel archiver and indexer for unstructured data
  6. High-speed, massively parallel archiver and indexer for structured data
  7. High-speed, massively parallel text engine
  8. High-speed, massively parallel SQL analytic engine
  9. Retention Manager
  10. Auto-classification module
  11. Collaboration Manager
  12. Browser-based Graphical User Interface (GUI)
  13. Administration Module
  14. SQL API for SQL tools
  15. Web Services for non-SQL tools
  16. Back-end data repository with advanced data encryption
  17. Monitoring, Diagnostics and Notification module
  18. Built-in Email Server
  19. Custom Web crawler
  20. eDiscovery Module
  21. Audit and Compliance Module
  22. Multi-site replicator (custom) with NonStopIQ Connector
  23. Mobile interface (smartphone)
  24. High-speed email exporter for recovery of MS Exchange
  25. High-speed file exporter for recovery of file systems or data streaming
  26. Email conversion Engine

29. Is BMMsoft a member of HP AllianceONE?

Yes

30. How many sales of EDMT® Solutions per year can BMMsoft (with SIs) support?

Over 400

31. How does HP Sales Rep engage EDMT® SWAT Team?

Contact Paul Krneta or Kathleen Naganuma at BMMsoft.