Database Benchmark
Some people created a nice well made GUI based tool for comparing Database performance and size. I used this tool for performance tuning of VelocityDB. The benchmarks are limited in scope but for what they test, they are very good and compare
11 different Databases.
VelocityDB performes nicely in comparison. Only the mysterious STSdb performes better in some cases but STSdb has a very limited API, isn't distributable, and cannot directly store classes with relations to other classes. The entire benchmarks lacks such cases and also doesn't test large objects.
Databases part of the benchmark include: STSdb, Acess 2007, db4objects, Firebird, H2, MS SQL Server, MySQL, Oracle BerkeleyDB, Perst, and SQLite. VelocityDB is not yet part of the
official benchmark but I will try to have it added.
You can test VelocityDB with this tool using the VelocityDB implementation of the benchmark tests. The tests currently do not use .net 4 which is what VelocityDB requires but you can make it work using .Net 4.0.
VelocityDB files: VelocityDbBenchmarkCompressed.zip
. The C# source code for the entire tool can be downloaded from
here. The benchmark shown here is with RecordCount being 250,000.
Adds [RecordCount] number of records to the database. The database must give an opportunity to directly access the added records by their index - 0, 1, 2, ... RecordCount - 1 thereafter.
As you can see, VelocityDB is way faster than these other Database Systems.
Click here to see the results for the other tests part of the 3rd party test suite
Here a smaller bar is better! Not only is VelocityDB faster, it does it producing smaller databases, especially when compression is enabled.
A user asked for a comparison with Eloquera. We implemented the first two tests using Eloquera - we can implement the remaining ones on request
This time we used a faster PC for the benchmark, a ZT Affinity 7644Mi Desktop PC
, and 5 million records were added. The src for the Eloquera implementation is
here
Enumerate the records in order that they are added with the Append() method.
All of these Database systems adds the same type of Tick object , see class definition below.
public class Tick
{
public string Symbol { get; set; }
public DateTime Timestamp { get; set; }
public double Bid { get; set; }
public double Ask { get; set; }
public long Volume { get; set; }
public string Exchange { get; set; }
}
Triangle Counter
This test idea started with a Vertica comparison with Hadoop and PIG.
Given 86,220,856 tuples compute the number of triangles that can be formed from these edges. We were not able to run the Vertica, Hadoop or PIG solution yet due to build issues (missing build.xml).
Note that VelocityDB only needs about 387MB to store the data for this while Hadoop is claimed to need 160GB !!
This same test was also performed using Oracle on a very expensive system.
After the Oracle publication came a
Kognitio press release where they claim to beat the other
databases.
We ran a VelocityDB version of this test on a Gateway SX2800 C2Q Desktop PC purchased refurbished for $329 in 2009
plus a $30 memory upgrade in December 2011.
Using this low end hardware with Windows 7 and 8GB memory, we most likely outperform all the other
databases.
Using the Gateway PC, VelocityDB counts the triangles between the 4846609 nodes in test data in about 99 seconds plus 30 seconds for loading the database using 4 threads.
Now, with a more powerful
ZT Affinity 7644Mi Desktop PC (made in USA), we see improved numbers: 42 seconds plus 14 seconds to load the database. The same calculations on a very powerful server using Hadoop: 13140 s, PIG: 9540 s, Vertica: 287 s.
We will test on a big server asap, we are confident we'll be better or near the
Oracle & Kognitio results when utilizing 96GB memory and 12 or more fast cores (cpu's). If you have access to
such a powerful Windows machine, please help us out by running the VelocityDB version of the triangle counter test. We want to
prove that we are #1!
C# VelocityDB code for TriangleCounter
using System;
using System.Runtime;
using System.Collections.Generic;
using System.Linq;
using System.IO;
using System.Text;
using System.Threading;
using System.Threading.Tasks;
using VelocityDb;
using VelocityDb.Collection.BTree;
using VelocityDb.Session;
namespace TriangleCounter
{
class TriangleCounter
{
static readonly string systemDir = System.IO.Path.Combine(Environment.GetFolderPath(Environment.SpecialFolder.MyDocuments), "VelocityDB", "Databases", "TriangleCounter");
static int discoverTrianglesSingleCore(BTreeMap<int, int[]> edges)
{
int triangles = 0;
BTreeMapIterator<int, int[]> edgesItr = edges.Iterator();
while (edgesItr.MoveNext())
{
int nodeId = edgesItr.CurrentKey();
int[] edge = edgesItr.CurrentValue();
int stop = edge.Length - 1;
int i = stop;
int edgeToStart, edgeTo;
int pos;
while (i >= 0)
{
int[] edgeInfo2;
edgeToStart = edge[i--];
if (nodeId < edgeToStart)
{
if (edges.TryGetValue(edgeToStart, out edgeInfo2))
{
for (int j = stop; j >= i; j--)
{
edgeTo = edge[j];
if (edgeToStart < edgeTo)
{
pos = Array.BinarySearch<int>(edgeInfo2, edgeTo);
if (pos >= 0)
{ // we know this one is connected to edgeInfo.From because it is part of edgeInfo.To
triangles++;
}
}
else
break;
}
}
}
else
break;
}
}
return triangles;
}
static void Main(string[] args)
{
long triangles = 0;
try
{
using (SessionNoServer session = new SessionNoServer(systemDir))
{
int numberOfWorkerThreads = -1;
if (args.Length > 0)
{
if (!int.TryParse(args[0], out numberOfWorkerThreads))
Console.WriteLine("First parameter is numberOfWorkerThreads which must be an Int32");
}
Placement btreePlace = new Placement(40, 1, 1, 10000, 65500, true);
Placement edgeInfoPlace = new Placement(40, 10000, 1, 10000, 65500, true);
session.BeginUpdate();
BTreeMap<int, int[]> edges;
BTreeMapIterator<int, int[]> edgesItr;
int[] edge = null;
Database edgeDb = session.OpenDatabase(40, false, false);
if (edgeDb != null)
{
session.Commit();
session.BeginRead();
edges = (BTreeMap<int, int[]>)session.Open(40, 1, 1, false);
}
else
{
DatabaseLocation location = session.DatabaseLocations.Default();
//location.CompressPages = false; // no compression should make it faster (?)
session.NewDatabase(40, 395, "Edges");
edges = new BTreeMap<int, int[]>(null, session, 6000);
edges.Persist(btreePlace, session, true);
edgesItr = edges.Iterator();
using (StreamReader stream = new StreamReader("c:\\edges.txt", true))
{
int a;
int b;
string line;
string[] fields;
while ((line = stream.ReadLine()) != null)
{
fields = line.Split(' ');
if (!int.TryParse(fields[0], out a))
break;
b = int.Parse(fields[1]);
if (a != b)
{
if (edgesItr.CurrentKey() == a || edgesItr.GoTo(a))
{
edge = edgesItr.CurrentValue();
Array.Resize(ref edge, edge.Length + 1);
edge[edge.Length - 1] = b;
edgesItr.ReplaceValue(ref edge); // we need to update the value in the BTreeMap
}
else
{
edge = new int[1];
edge[0] = b;
edges.Add(a, edge);
}
}
}
}
edgesItr = edges.Iterator();
while (edgesItr.MoveNext())
{
edge = edgesItr.CurrentValue();
Array.Sort(edge);
edgesItr.ReplaceValue(ref edge);
}
session.Commit();
session.BeginRead();
}
Console.WriteLine("Number of Nodes found: " + edges.Count);
if (numberOfWorkerThreads > 0)
Console.WriteLine("Start of triangle discovery using " + numberOfWorkerThreads + " threads, time is " + DateTime.Now);
else if (numberOfWorkerThreads < 0)
Console.WriteLine("Start of triangle discovery using system automatically selected number of threads, time is " + DateTime.Now);
else
Console.WriteLine("Start of triangle discovery using main thread, time is " + DateTime.Now);
// Start counting triangles !
if (numberOfWorkerThreads != 0)
{
edgesItr = edges.Iterator();
ParallelOptions pOptions = new ParallelOptions();
pOptions.MaxDegreeOfParallelism = numberOfWorkerThreads;
// First type parameter is the type of the source elements
// Second type parameter is the type of the local data (subtotal)
Parallel.ForEach<KeyValuePair<int, int[]>, long>(edges, // source collection
pOptions,
() => 0, // method to initialize the local variable
(pair, loop, subtotal) => // method invoked by the loop on each iteration
{
int nodeId = pair.Key;
int[] nodeTo = pair.Value;
int stop = nodeTo.Length - 1;
int i = stop;
int edgeToStart, edgeTo;
int pos;
while (i >= 0)
{
int[] edgeInfo2;
edgeToStart = nodeTo[i--];
if (nodeId < edgeToStart)
{
if (edges.TryGetValue(edgeToStart, out edgeInfo2))
{
for (int j = stop; j >= i; j--)
{
edgeTo = nodeTo[j];
if (edgeToStart < edgeTo)
{
pos = Array.BinarySearch<int>(edgeInfo2, edgeTo);
if (pos >= 0)
{ // we know this one is connected to edgeInfo.From because it is part of edgeInfo.To
subtotal++;
}
}
else
break;
}
}
}
else
break;
}
return subtotal;
},
// Method to be executed when all loops have completed.
// finalResult is the final value of subtotal. supplied by the ForEach method.
(finalResult) => Interlocked.Add(ref triangles, finalResult));
}
else
triangles = discoverTrianglesSingleCore(edges);
session.Commit();
}
Console.WriteLine("Number of Triangles found: " + triangles);
}
catch (Exception e)
{
System.Console.WriteLine(e);
}
}
}
}
Computing the Kevin Bacon Numbers
This is a sample that shows a problem that is easy to solve with VelocityDB but may be difficult to do using only SQL and in any case, the VelocityDB solution will be faster!
Other database vendors use this same example, see
HPCC Systems and
Neo4J
It is a problem that is supposed to be assigned to graph databases but as you
can see it is very easily handled by VelocityDB and the performance is
unbeatable. We look forward to seeing numbers to compare with.
See our sample page for further info.