ETEP - Electronic Trade Efficiency Program

On this page:
The ETEP software
The ETEP Design Principles
The ETEP System Architecture
The Client
The Server
Information Model
Information Distribution Model
Database M2
The Engine
Data Model
Client API's
Query Language
Replication
Platform Requirements
Other links:
ETEP User Reference Card
Technical White Paper

The ETEP Software

ETEP, the Electronic Trade Efficiency Program, is a scalable distributed architecture for disseminating and managing business contact and trading information. ETEP is a complete solution for global trade. It maintains a global distributed database of business contacts and Electronic Trade Opportunities (ETOs). The database is maintained in a network of Unix-based ETEP servers and can be accessed and updated through a Windows-based ETEP client application. Parts of the network's services are available through WWW as HTML pages.

We set out to design a backbone information system for real-time, on-line world trade. This mission required the basic design to meet these criteria:

High Performance.
Distributed Architecture.
Equal and Transparent Access to Information.
Scalability.
Fault Tolerance.
Ease of Administration.
Speed of Development and Rapid Adaptation to Changes in Requirements.

The ETEP Design Principles

The underlying technology is widely applicable to tasks requiring instant network-wide access to fast changing distributed information. The ETEP architecture offers the following advantages:

Distribution
Global Location Consciousness
Replication
Caching
Dynamic Data Dictionary
Scalability

The ETEP System Architecture

ETEP is designed to be an open, scalable system for on-line access to large volumes of constantly changing business data. It is a framework in which the users themselves both feed and retrieve the data. The operator of an ETEP site can customise billing of services and access to the trading information.

The Client

The ETEP client software offers more functionality than the WWW-gate used with a browser. Industry-specific modifications can be made simply by downloading the presentation object over the network to the client, without changing the program. The client can also be run in off-line mode, and analyses can be run against the local database; the Global Data Directory (GDD) must first be fetched from the server. Information can be exported to desktop applications such as Excel and Word. Email is built in and requires no separate accounts.

ETEP services are available through a Windows-based client application as well as through WWW. The client application is a generic forms system with a local database that holds all objects the client has either stored into or retrieved from the ETEP network. The data is managed as a collection of related objects, and the user interface allows navigating between objects via references, in hypertext style. All object classes and related form layouts are downloaded from the ETEP network. New classes, forms, etc. are automatically downloaded as they are released into the ETEP network. The client is also capable of browsing through its local store of objects without connecting to an ETEP server.

The Server

The ETEP server software consists of two main parts. The first is the front end, which handles communication between the database and the ETEP client software. The second is the WWW-gate, which allows users to access the database with a WWW browser. The WWW-gate includes functions for data entry and for inquiries on company and offer records. An efficient inquiry system is based on intelligent product/area/business matrices and global data directories. The WWW-gate converts the requested database information dynamically into HTML pages. Linking to the home pages of subscribers is supported.

The software combines a high performance database and the ETEP-specific distribution, caching and querying logic. New sites are easy to add and most configuration information is downloaded from the network. The support, installation and administration services are provided over the Internet.

Information Model

ETEP's database holds the following types of objects:

Classifications of Products, Areas and Businesses
Network Topology Information - Servers and Their Area of Responsibility
Business Contact Information
ETO Information
User Account Information
Count of ETO's and Business Contacts for Each Area / Product Category Combination

The ETO's and business contacts are classified by product type or business field and by geographical area. Additionally, ETO's are indexed by time of arrival and by owner of the ETO, typically the vendor. The geographical areas and the products each form a tree. ETO's and companies can be retrieved according to both, e.g. `electronics in China', `PC peripherals in Taiwan', `Foodstuffs in Eastern Europe'. Additional criteria can be specified on price, date of the ETO etc.

To facilitate the trade match process, ETEP keeps a `Global Data Directory' or GDD: the set of real-time counts of ETO's or businesses in each area/category pair. The count of offers in each category/area combination is thus always kept up to date, which makes the GDD an extremely powerful tool for getting the `big picture' of the situation.

The GDD is displayed as a matrix with areas on one axis and product categories and business fields on the other. Each cell holds the number of sale or purchase offers or business contacts available in that particular area/category combination. Clicking on any cell of the GDD matrix shows a more detailed division of area and product categories.

The GDD matrix is an especially useful tool for interactive use. It is available through WWW and ETEP/Desktop.
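
The roll-up behind such a matrix can be illustrated with a minimal Python sketch. It assumes both axes are given as child-to-parent maps and that each new ETO increments its own cell plus every coarser cell above it; the class and method names are hypothetical and are not taken from the ETEP software.

```python
from collections import defaultdict


class GlobalDataDirectory:
    """Toy GDD: counts per (area, category) cell, rolled up along both trees."""

    def __init__(self, area_parent, category_parent):
        # Each mapping gives child -> parent; the root maps to None.
        self.area_parent = area_parent
        self.category_parent = category_parent
        self.counts = defaultdict(int)

    def _ancestors(self, node, parent):
        # Yield the node itself, then each ancestor up to the root.
        while node is not None:
            yield node
            node = parent.get(node)

    def add(self, area, category, delta=1):
        # Update the finest cell and every coarser cell that contains it.
        for a in self._ancestors(area, self.area_parent):
            for c in self._ancestors(category, self.category_parent):
                self.counts[(a, c)] += delta

    def count(self, area, category):
        return self.counts.get((area, category), 0)

    def drill_down(self, area, category):
        # Counts one level finer on the area axis, as after clicking a cell.
        children = [a for a, p in self.area_parent.items() if p == area]
        return {a: self.count(a, category) for a in children}


# Hypothetical classification trees and offers, for illustration only.
areas = {"World": None, "Asia": "World", "China": "Asia", "Taiwan": "Asia"}
categories = {"All": None, "Electronics": "All", "PC peripherals": "Electronics"}

gdd = GlobalDataDirectory(areas, categories)
gdd.add("Taiwan", "PC peripherals")
gdd.add("Taiwan", "PC peripherals")
gdd.add("China", "Electronics")

print(gdd.count("Asia", "Electronics"))       # coarse cell
print(gdd.drill_down("Asia", "Electronics"))  # the same cell, one level finer
```

Updating every ancestor cell on insert keeps reads of any cell to a single lookup, at the cost of a little extra work per new ETO. Whether ETEP maintains its counts this way is not stated in the text.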

Information Distribution Model

Business contact, ETO and GDD information has to be equally accessible from all nodes of the network in real time. Complete replication of data between nodes is, however, not a practical possibility.

ETO and business contact data can be distributed among multiple ETEP server sites. The geographical area, business field and product type form a natural basis for distributing information among servers.

We can think of ETO's as dots positioned on squares in a checkerboard with products on one axis and locations on the other. The division of both axes is hierarchical, with finer and finer categorisation on successive levels. Each ETO has a well-defined place on this plane. The total count of ETO's in each square of the matrix is what the GDD shows.

The network topology database is replicated on all nodes of an ETEP network. This database specifies which nodes hold data belonging to which product classes and geographical areas. The system can therefore always direct queries to the appropriate place, and the entire network is at the fingertips of the user independently of the mode of access, whether WWW or the ETEP client.

It is possible to replicate data between ETEP nodes. The network topology database itself is an example of replication. All nodes of the ETEP network know if replicated copies of data in a particular product/area combination exist. If so, ETEP can automatically redirect the query to the nearest source of the information.

There is however a difference between replicated copies and the `original' or primary copy of an ETO. This is required to be able to support commercial transactions that modify the ETO's involved.

This is needed in order to scale the system from an electronic display of offer and demand into an actual trading place.
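
The routing described in this section can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the topology is modelled as a flat list of holdings, "nearest" is a caller-supplied distance table, and modifying requests are sent to the primary copy only. The node names, the `Holding` record and the `route` function are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Holding:
    node: str        # server name (hypothetical)
    area: str        # geographical area held by the node
    category: str    # product category held by the node
    primary: bool    # True for the primary copy, False for a replica


def route(topology, area, category, distance, for_update=False):
    """Pick the node that should serve one area/category combination."""
    candidates = [h for h in topology
                  if h.area == area and h.category == category]
    if for_update:
        # Assumption: transactions that modify ETO's go to the primary copy.
        candidates = [h for h in candidates if h.primary]
    if not candidates:
        raise LookupError(f"no node holds {category} in {area}")
    # Reads are redirected to the nearest copy, primary or replica.
    return min(candidates, key=lambda h: distance[h.node]).node


# Every node would carry the same replicated topology list.
topology = [
    Holding("taipei", "Taiwan", "PC peripherals", primary=True),
    Holding("helsinki", "Taiwan", "PC peripherals", primary=False),
    Holding("warsaw", "Eastern Europe", "Foodstuffs", primary=True),
]
# Hypothetical network distances as seen from a client near Helsinki.
distance = {"taipei": 9, "helsinki": 1, "warsaw": 3}

read_node = route(topology, "Taiwan", "PC peripherals", distance)
write_node = route(topology, "Taiwan", "PC peripherals", distance, for_update=True)
print(read_node, write_node)
```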

Database M2

The Engine

Clustered Index - The index is a B tree of 4K pages. The rows are stored on index pages physically following the primary key's index entry. The programmer has the choice to cluster rows of different tables by the value of the primary key or by the table. Each non-primary key is followed by the primary key parts so as to reference the actual row. Clustering may be separately specified for each index. Splitting of index pages is optimised to produce compact pages for serial inserts. BLOB's not fitting on the index page are stored as linked lists of pages referenced from the row.
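
As a rough illustration of how a non-primary key entry can lead back to the clustered row, the Python sketch below uses sorted lists in place of B-tree pages. It is not M2 code: the `ClusteredTable` class, the `owner` column and the use of `bisect` are assumptions made only for this example.

```python
import bisect


class ClusteredTable:
    """Rows stored in primary-key order; the secondary index holds (owner, pk)."""

    def __init__(self):
        self.keys = []        # sorted primary keys
        self.rows = []        # rows[i] is stored alongside keys[i]
        self.secondary = []   # sorted (owner, primary_key) entries, no row data

    def insert(self, pk, owner, payload):
        i = bisect.bisect_left(self.keys, pk)
        if i < len(self.keys) and self.keys[i] == pk:
            raise KeyError(f"duplicate primary key {pk!r}")
        self.keys.insert(i, pk)
        self.rows.insert(i, {"id": pk, "owner": owner, "payload": payload})
        # The secondary key is followed by the primary key parts.
        bisect.insort(self.secondary, (owner, pk))

    def get(self, pk):
        i = bisect.bisect_left(self.keys, pk)
        if i < len(self.keys) and self.keys[i] == pk:
            return self.rows[i]
        return None

    def find_by_owner(self, owner):
        # Walk the contiguous run of entries for this owner, then follow
        # each trailing primary key to the actual row.
        i = bisect.bisect_left(self.secondary, (owner,))
        found = []
        while i < len(self.secondary) and self.secondary[i][0] == owner:
            found.append(self.get(self.secondary[i][1]))
            i += 1
        return found


table = ClusteredTable()
table.insert(3, "acme", "offer: keyboards")
table.insert(1, "zeta", "offer: rice")
table.insert(2, "acme", "offer: mice")
acme_ids = [row["id"] for row in table.find_by_owner("acme")]
print(acme_ids)
```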

Transaction Control - The transaction is the basic unit of concurrency control. Normal transaction control is based on page-level read/write locking. The set of written pages is maintained as an after-image delta over the before image.

Non-Locking Read and Checkpoint - It is possible to start a clean, repeatable non-locking read at any time. Write transactions committing after the read is started will form a delta not visible in the clean read-only space. Several such read transactions can be in progress simultaneously, each seeing the state of the database as of the last transaction committed before starting the read. The database also has a special read-only state called the checkpoint. All pages `below' the checkpoint form a read-only image. Transactions modifying pages `under' the checkpoint make copies of the updated pages. Thus a log needs to be kept only for updates committed since the last checkpoint. A new checkpoint can be made at any time without interfering with normal operation.
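
The visibility rule for non-locking reads can be sketched as follows. This toy Python model keeps one delta per committed write transaction over a read-only base image and lets each reader see only the deltas committed before it started. The class names are hypothetical, and real page management, locking and checkpointing are left out.

```python
class PagedStore:
    """Toy page store: a read-only base image plus one delta per commit."""

    def __init__(self, pages):
        self._base = dict(pages)   # image as of the last checkpoint
        self._deltas = []          # page_no -> new contents, one dict per commit

    def commit(self, changed_pages):
        # Writers never modify the base image; they append a delta instead.
        self._deltas.append(dict(changed_pages))

    def begin_read(self):
        # The reader remembers how many commits existed when it started.
        return ReadTransaction(self, len(self._deltas))

    def read_as_of(self, page_no, version):
        # The newest delta visible to the reader wins; otherwise use the base.
        for delta in reversed(self._deltas[:version]):
            if page_no in delta:
                return delta[page_no]
        return self._base[page_no]


class ReadTransaction:
    def __init__(self, store, version):
        self._store = store
        self._version = version

    def read(self, page_no):
        return self._store.read_as_of(page_no, self._version)


store = PagedStore({1: "a", 2: "b"})
early = store.begin_read()
store.commit({1: "A"})
later = store.begin_read()
store.commit({2: "B"})

print(early.read(1), later.read(1), later.read(2))
```

Both readers stay repeatable however many commits follow, and neither takes a lock.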

File Allocation - The database is maintained as a single file under the native file system. The file is allocated in configurable size chunks to maintain locality. A separate file is used for the log. A database can be divided over several files on separate devices for an address space of 4G pages, i.e. 44 bits.
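
The 44-bit figure can be checked with a short calculation, assuming the 4K page size given for index pages applies to the whole address space:

```python
PAGE_SIZE = 4 * 1024          # 4K bytes per page (assumed to apply file-wide)
PAGE_COUNT = 4 * 1024 ** 3    # 4G pages

total_bytes = PAGE_SIZE * PAGE_COUNT
address_bits = total_bytes.bit_length() - 1   # exact, since the total is a power of two

print(address_bits, total_bytes // 1024 ** 4) # bits, and size in units of 2**40 bytes
```

That is 2^32 pages of 2^12 bytes each, giving 2^44 bytes in total.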

Transaction Logging - When a transaction is committed it is possible to specify how it will be logged. Normally, a database keeps a log of updates more recent than the last checkpoint for use in roll-forward recovery. It is also possible to run without logging and to write transactions into application-specific logs used for replicating.

Multithreaded Operation - The database engine is designed to take full advantage of multithreading and SMP. Only the commit operations are serialised and disk I/O may simultaneously proceed on multiple threads.

Data Model

The data model is relational with object-oriented extensions. The design goal has been to give the developer relational modeling with the added expressive power of inheritance, run-time typing and late binding.

The database is structured in tables, columns and indices, just as any relational database. One index must be unique and is designated as the primary key with which the dependent part of the row is stored.

The notion of a subtable introduces inheritance. You may add columns or new keys to a subtable. A SELECT from a given table will also retrieve matching rows of subtables. A special column name will retrieve the entire row of a table as an object. This object can then be opened by the application.
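
The effect of subtables on a SELECT can be illustrated with a small in-memory Python model. It is a sketch of the behaviour described above, not of M2's implementation; the `TableDef` class and the example tables and columns are hypothetical.

```python
class TableDef:
    """Toy table: a subtable inherits its parent's columns and may add more."""

    def __init__(self, name, columns, parent=None):
        self.name = name
        self.parent = parent
        self.columns = (parent.columns if parent else []) + list(columns)
        self.subtables = []
        self.rows = []
        if parent:
            parent.subtables.append(self)

    def insert(self, **values):
        unknown = set(values) - set(self.columns)
        if unknown:
            raise ValueError(f"unknown columns: {sorted(unknown)}")
        self.rows.append(values)

    def select(self, predicate=lambda row: True):
        # Yield matching rows of this table and, recursively, of its
        # subtables, tagged with the table each row actually belongs to.
        for row in self.rows:
            if predicate(row):
                yield self.name, row
        for sub in self.subtables:
            yield from sub.select(predicate)


contact = TableDef("contact", ["name", "area"])
vendor = TableDef("vendor", ["products"], parent=contact)

contact.insert(name="Importer A", area="Taiwan")
vendor.insert(name="Vendor B", area="Taiwan", products="mice")

in_taiwan = list(contact.select(lambda r: r["area"] == "Taiwan"))
only_vendors = list(vendor.select())
print([table for table, _ in in_taiwan], [table for table, _ in only_vendors])
```

Tagging each result with its concrete table is what lets an application open the row as an object of the right class.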

Object ID's are supported as a separate data type and a separate operation allows dereferencing object ID's in queries. An API allows retrieving an object regardless of class (table) given the ID. A similar API allows retrieving a row given a representation of the primary key as a single string.

For an object to be accessible by ID one of the table's keys must be designated as the ID key. This may or may not be the primary key.

All columns are tagged by data type and length. Using an appropriate API an application can switch on a column's type.

When a table is altered a new copy of the schema entry is made. Rows identify their key by an ID which can be detected to be obsolete since a new ID is given to each key when the table is changed. In this manner instances can migrate to a new schema incrementally in conjunction with the update operation.
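
Incremental migration of this kind might look roughly like the Python sketch below. It simplifies the description above by tagging each row with a single schema ID instead of a per-key ID, and it supports only adding a column with a default value; the `VersionedTable` class and its methods are hypothetical.

```python
class VersionedTable:
    """Toy table whose rows carry the ID of the schema they were written under."""

    def __init__(self, columns):
        self.schemas = {1: list(columns)}   # schema ID -> column names
        self.defaults = {}                  # column -> value used when migrating
        self.current_id = 1
        self.rows = {}                      # primary key -> (schema ID, values)

    def alter_add_column(self, column, default=None):
        # Altering the table makes a new schema entry under a fresh ID;
        # existing rows are left untouched and keep their old ID.
        new_columns = self.schemas[self.current_id] + [column]
        self.current_id += 1
        self.schemas[self.current_id] = new_columns
        self.defaults[column] = default

    def insert(self, pk, values):
        self.rows[pk] = (self.current_id, dict(values))

    def is_obsolete(self, pk):
        return self.rows[pk][0] != self.current_id

    def update(self, pk, **changes):
        schema_id, values = self.rows[pk]
        if schema_id != self.current_id:
            # Migrate lazily: fill in columns the old schema did not have.
            values = {c: values.get(c, self.defaults.get(c))
                      for c in self.schemas[self.current_id]}
        self.rows[pk] = (self.current_id, {**values, **changes})


offers = VersionedTable(["id", "price"])
offers.insert(1, {"id": 1, "price": 10})
offers.insert(2, {"id": 2, "price": 20})
offers.alter_add_column("currency", default="USD")

offers.update(1, price=12)   # row 1 migrates as part of its update
print(offers.rows[1], offers.is_obsolete(2))
```

Row 2 stays in the old layout until something updates it, so an ALTER never has to rewrite the whole table at once.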

Client API's

An ODBC level 2 compliant API is supported under Windows and Unix. It can use SQL or EQL as alternative query languages and takes full advantage of ODBC's features such as asynchronous execution and array parameters.

A Lisp based API for Allegro Common Lisp is available. It provides easy to use Lisp access to ODBC functionality. Moreover, CLOS objects can be made persistent and retrieved from a persistent store either by value based queries or identity based navigation.

Query Language

In addition to SQL 2, M2 provides a lower-level query language, EQL. The API's for both languages are the same. EQL allows the developer to explicitly control which indices are used and in which order joins are made. EQL also provides support for identity-based retrieval.

Replication

Replication is supported at the transaction level. Any transaction can have a `circulation list' which holds the names of the hosts that will replicate it. Additionally transactions can be logged in separate replication feeds. Certain logs for instance may be kept longer than other logs. In this manner the systems designer has full control over the replication of transactions.

The replication server is a separate process. Normally the database server feeds it a stream of committed log entries to broadcast to replicators. When a server that has been separated from the network reconnects to the replication server, the replication server reconstructs the appropriate replication feed from the logs tagged by circulation lists.
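
A minimal Python sketch of circulation lists and catch-up on reconnection follows. It assumes log entries carry a commit sequence number and that the replication server remembers the last entry delivered to each host; these details, and all names in the code, are hypothetical and not taken from M2.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogEntry:
    seq: int                 # commit order (assumed)
    circulation: frozenset   # hosts that should replicate this transaction
    payload: str


class ReplicationServer:
    def __init__(self):
        self.log = []          # committed entries, kept for later replay
        self.connected = set()
        self.last_seen = {}    # host -> seq of the newest entry delivered to it
        self.outbox = []       # (host, entry) pairs sent so far, for inspection

    def _deliver(self, host, entry):
        self.outbox.append((host, entry))
        self.last_seen[host] = entry.seq

    def publish(self, entry):
        # Called with each committed transaction, in commit order.
        self.log.append(entry)
        for host in sorted(entry.circulation & self.connected):
            self._deliver(host, entry)

    def disconnect(self, host):
        self.connected.discard(host)

    def reconnect(self, host):
        # Rebuild the feed the host missed from the logged circulation lists.
        missed = [e for e in self.log
                  if host in e.circulation
                  and e.seq > self.last_seen.get(host, 0)]
        for entry in missed:
            self._deliver(host, entry)
        self.connected.add(host)
        return missed


server = ReplicationServer()
server.reconnect("tokyo")
server.reconnect("oslo")
server.publish(LogEntry(1, frozenset({"tokyo", "oslo"}), "new ETO"))
server.disconnect("oslo")
server.publish(LogEntry(2, frozenset({"tokyo", "oslo"}), "price change"))
server.publish(LogEntry(3, frozenset({"tokyo"}), "tokyo-only update"))
missed = server.reconnect("oslo")
print([entry.seq for entry in missed])
```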

Platform Requirements

The Server

The ETEP server software and the M2 database run on RS6000 hardware and the AIX operating system, release 3.2 or higher. In the near future the software will also be available for Solaris 2.0 on SUN SPARCstation. The final hardware configuration depends on the number of users and online transactions. An example of the hardware needed is as follows:

CPU 100 MHz or greater
RAM 32 MB minimum
Floppy disk drive
Hard disk 1GB minimum
High resolution 17" colour display
Ethernet for TCP/IP data communication
DAT-recorder for backup purposes

The Client

The ETEP client software runs on the MS Windows 3.x operating system and uses TCP/IP networking software such as Trumpet Winsock. In the near future the ETEP client will be available on Windows 95. A 386 PC with a VGA monitor and 4 MB of memory is recommended as a minimum. The WWW browser must support HTML 3.0, forms and tables. A suitable WWW browser is Netscape 1.1 and a suitable emailer is Pegasus version 2.0.

