"A human being should be able to change a diaper, plan an invasion, butcher a hog, conn a ship, design a building, write a sonnet, balance accounts, build a wall, set a bone, comfort the dying, take orders, give orders, cooperate, act alone, solve equations, analyze a new problem, pitch manure, program a computer, cook a tasty meal, fight efficiently, die gallantly. Specialization is for insects." (Robert A. Heinlein)
Showing posts with label Database-population. Show all posts
Showing posts with label Database-population. Show all posts

Tuesday, 5 August 2014

Neo4j and Java: demos with an embedded Ne04j graph


After my first experience in installing Neo4j graph database I decided to continue my experiments by writing a little Java demo program. The scope of my program just to learn how to connect to a Neo4j embedded graph, to generate ,connect and query some hundreds of nodes. Neo4j site and the downloaded manual provide plenty of documentation about interfacing with Java, and the other supported languages.

Project set-up

Setting up a Java project is quit simple: just matter of including all Neo4j libraries jars, available in the 'lib' folder, in the project class-path. To make easier future projects set-up I prepared, in Netbeans, a custom library configuration.


Wednesday, 5 November 2008

Generating random people with PostgreSQL

I just installed PostgreSQL on my PIII550 linux 'server' and I used the SQL example from my last post to do some test. Only the random select query and the procedure needed some little rework due to the different syntax:
Here is the random selection query
SELECT NAME, SEX,
                   CURRENT_DATE - CAST(RANDOM()*365.25*100 AS INTEGER) AS DOB,
                  (SELECT SURNAME FROM SURNAMES ORDER BY RANDOM() LIMIT 1) AS SURNAME
FROM NAMES ORDER BY RANDOM() LIMIT 1;
 Here is the procedure:
CREATE OR REPLACE FUNCTION FILL(IN NUM INT) RETURNS VARCHAR(1) AS $$
    DECLARE
          N INTEGER;
       BEGIN
              N:=NUM;
             WHILE N > 0 LOOP
                  INSERT INTO PEOPLE(NAME,SEX,DOB,SURNAME)
                  SELECT NAME, SEX,
                                     CURRENT_DATE - CAST(RANDOM()*365.25*100 AS INTEGER) AS DOB,
                                   (SELECT SURNAME FROM SURNAMES ORDER BY RANDOM() LIMIT 1) AS SURNAME
                 FROM NAMES ORDER BY RANDOM() LIMIT 1;
                N := N - 1;
       END LOOP;
       RETURN 'T';
END; $$
LANGUAGE PLPGSQL;
(Update)
Forgot to say it takes about seven seconds to insert ten thousands rows ... quite faster than MySQL.

test=# SELECT NOW();SELECT FILL(10000);SELECT NOW();
now
——————————-
2008-11-09 17:32:53.985751+01 (1 row)
fill
——
T
(1 row)
now
——————————-
2008-11-09 17:32:59.359207+01
(1 row)
test=#

Friday, 31 October 2008

Populating a table with random people


It happened I had  to  fill a table with realistic but not real people data.Here how did I managed it (ported to MySQL sintax).
First lets create a people table like this :
CREATE TABLE PEOPLE(NAME VARCHAR(20),
                                     SURNAME VARCHAR(20),
                                     SEX VARCHAR(1),
                                     DOB DATE);
Then create tables with sample names and surnames:
CREATE TABLE NAMES(NAME VARCHAR(20), SEX VARCHAR(1));
CREATE TABLE SURNAMES(SURNAME VARCHAR(20));
And fill them with some values:
INSERT INTO NAMES(NAME,SEX) VALUES('Jhon','M');
INSERT INTO NAMES(NAME,SEX) VALUES('Jack','M');
INSERT INTO NAMES(NAME,SEX) VALUES('Jeff','M');
INSERT INTO NAMES(NAME,SEX) VALUES('Judy','F');
INSERT INTO NAMES(NAME,SEX) VALUES('Jade','F');
INSERT INTO NAMES(NAME,SEX) VALUES('Jane','F');
INSERT INTO SURNAMES(SURNAME) VALUES('Red');
INSERT INTO SURNAMES(SURNAME) VALUES('White');
INSERT INTO SURNAMES(SURNAME) VALUES('Green');
Now we'll be able to get a single random person with a query like this:
SELECT NAME, SEX,
           CURDATE() - INTERVAL RAND()*365.25*100 DAY AS DOB,
                 (SELECT SURNAME FROM SURNAMES ORDER BY RAND() LIMIT 1) AS SURNAME
FROM NAMES ORDER BY RAND() LIMIT 1;
We can write a simple procedure to populate the table with the desired number of rows:
CREATE PROCEDURE FILL(IN NUM INT)
   BEGIN
       WHILE NUM > 0 DO
          INSERT INTO PEOPLE(NAME,SEX,DOB,SURNAME)
          SELECT NAME, SEX,
                            CURDATE() - INTERVAL RAND()*365.25*100 DAY AS DOB,
                         (SELECT SURNAME FROM SURNAMES ORDER BY RAND() LIMIT 1) AS SURNAME
         FROM NAMES ORDER BY RAND() LIMIT 1;
         SET NUM = NUM-1;
      END WHILE;
END;
We can call this procedure like this:
mysql> call FILL(10000);
Query OK, 1 row affected (40.73 sec)
A ten thousand people population in forty seconds (on PIII at 550Mz) is not bad.
Of course many improvements can be done like adding a random place of birth (based on a table of places) or making date of birth distribution more realistic.