I’ve decided to use Visual Basic 6.0 for the majority of my development for one reason:
I enjoy it.
The Windows Speech API makes for decent enough speech-to-text in quiet environments. But my environment isn’t always quiet.
My first goal is to recreate this screen:
First and foremost, my personal requirement is that the screen must be interacted with through spoken commands ONLY.
The goal is to design the screens similar to the holodeck screen, where visual cues on the screen give the user hints on how to interact with the application.
So first, I created a preliminary database schema (all contained in a file named Holodeck Management System.sql), which is in its first stages as I test interactions with spoken commands.
Here’s a snapshot of the treeview of the database itself in SQL Management Studio:
And here’s a snapshot of the tables in their current form:
Since I am interested in testing the live interactions AND screen design with ‘real data’ rather than mocked up screenshots, I inserted dummy data into the database – some ‘dream scenes’ I would have (many clearly sexual) – as well as inserted data contained on the screens from the real holodeck aboard the Starship Enterprise.
Not wanting to violate CBS’s copyrights on LCARS – the computer system depicted on Star Trek – I redesigned my holodeck screen and got as far as creating this screen (with live data):
And after saying the word “NEXT”, which it does ‘hear’ pretty well, you get this screen:
From there, I worked on the speech commands and added functionality to go to the previous screen, to pop up Notepad, and to pull up Microsoft Paint. That is about where I started running into problems with applications freezing and with the speech code I had implemented.
Using Windows XP’s Task Manager to shut down the development environment gave me an idea. Speech interaction without a keyboard and mouse is already making it clear that I am going to have to redesign the standard user interface for accessing the computer to begin with.
What does that mean? I had no clue. But LCARS – the computer system as designed in Star Trek – requires touching the computer screens. I wanted to get away from that. But how?
So I began creating a speech-only ‘mock up’ of the task manager, a replacement for this:
I used the same (ugly) interface design I had, and after half a day’s work I came up with this:
That screen actually uses WMI (i.e. Select * from Win32_Process) to obtain the current process list.
But let’s be real. It looks like absolute shit.
I was unhappy with my screen design.
In my effort to be ‘futuristic’, I had alienated the first customer.
Hoping to derail the self-destructive thinking that usually ends exercises like this with me playing a video game for hours and feeling like I have accomplished nothing…
I switched back to the speech programming.
I converted my modules to classes, hiding basic code like memory management and initialization so I could focus my attention on cleaning up the speech.
Here are the Visual Basic 6.0 classes I have so far, which will still undergo some cleanup and pruning as I get into it more:
The critical classes I created for speech are CSpeech.cls, which handles the initialization and word/phrase management, and CSpeechInterpreted.cls, which is more of a static preprocessor for translating abbreviations and numerics to their English equivalents.
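To make the ‘static preprocessor’ idea concrete, here is a rough sketch of that kind of translation. It is written in Python rather than VB6 so it is quick to try, and it is NOT the actual CSpeechInterpreted.cls code: the abbreviation table, the function names and the 0 to 999 limit are all assumptions made up for illustration.

```python
# Hypothetical abbreviation table; the real list is whatever the class defines.
ABBREVIATIONS = {"dr.": "doctor", "st.": "street", "vs.": "versus"}

ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven",
        "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
        "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"]


def number_to_words(n: int) -> str:
    """Spell out an integer from 0 to 999 (the limit is an assumption)."""
    if not 0 <= n <= 999:
        raise ValueError("only 0..999 is handled in this sketch")
    if n < 20:
        return ONES[n]
    if n < 100:
        tens, ones = divmod(n, 10)
        return TENS[tens] if ones == 0 else TENS[tens] + " " + ONES[ones]
    hundreds, rest = divmod(n, 100)
    words = ONES[hundreds] + " hundred"
    return words if rest == 0 else words + " " + number_to_words(rest)


def interpret(text: str) -> str:
    """Replace known abbreviations and small numbers with spoken English."""
    words = []
    for token in text.lower().split():
        if token in ABBREVIATIONS:
            words.append(ABBREVIATIONS[token])
        elif token.isascii() and token.isdigit() and int(token) <= 999:
            words.append(number_to_words(int(token)))
        else:
            words.append(token)  # anything unknown passes through untouched
    return " ".join(words)
```

The point of doing this up front is that the phrase list handed to the speech engine only ever contains words a person would actually say out loud.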
The Interface – ISupportSpeech – should be pretty obvious – if an object supports speech interaction and has a vocabulary – then it supports this interface. Plain and simple.
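As a sketch of what ‘supports this interface’ means in practice, here is the same idea in Python (again, not the VB6 source). The method names vocabulary and on_phrase, the SceneListScreen class and the dispatch helper are hypothetical stand-ins, not members of the real ISupportSpeech.

```python
from typing import List, Protocol


class SupportsSpeech(Protocol):
    """Python stand-in for the ISupportSpeech idea: a vocabulary plus a handler."""

    def vocabulary(self) -> List[str]:
        ...

    def on_phrase(self, phrase: str) -> None:
        ...


class SceneListScreen:
    """Hypothetical paged screen that understands 'next' and 'previous'."""

    def __init__(self, page_count: int) -> None:
        self.page_count = page_count
        self.page = 0

    def vocabulary(self) -> List[str]:
        return ["next", "previous"]

    def on_phrase(self, phrase: str) -> None:
        if phrase == "next":
            self.page = min(self.page + 1, self.page_count - 1)
        elif phrase == "previous":
            self.page = max(self.page - 1, 0)


def dispatch(target: SupportsSpeech, heard: str) -> bool:
    """Forward a heard phrase only if it is in the target's vocabulary."""
    phrase = heard.strip().lower()
    if phrase not in target.vocabulary():
        return False  # e.g. 'notepad' is ignored by a screen that never listed it
    target.on_phrase(phrase)
    return True
```

Keeping the vocabulary on the object means the active screen decides which words are even candidates, which is one way to narrow what the recognizer has to choose between.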
This abstraction allowed me to focus on the problems I was encountering with speech translation: why was it hearing ‘notepad’ when I said ‘next’? Why was the computer hearing ‘SHUT DOWN’ when I hadn’t said anything?
That last particular command can be rather annoying I might add, particularly if you’re in the middle of doing something, as it would shut down all my work in progress.
YES, all the commands were functional.
With speech commands, it’s possible to weight certain ‘words’ and ‘phrases’ higher. Google does this too when translating a word and finding words ‘like’ what you want with searches.
So I started playing with ‘weighting’ commands higher to make sure shut down didn’t happen. This didn’t work well. At all.
So I started messing with ‘predictive’ guessing and hypothesis forming – and added the word ‘bad’ – just to try to divert anything that didn’t directly match a phrase to the ‘bad’ functionality.
All to no avail. Here’s a debug dump of a quick little trial where I spoke absolutely NOTHING and there was nothing but background noise in the Starbucks I am at:
So not only is/was it recognizing things that weren’t said, but it wasn’t even using the ‘bad’ keyword I had weighted high to try to reinforce a stricter recognition.
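For reference, the weighting-plus-‘bad’ idea looks roughly like this when written out. This is a Python sketch under loud assumptions: the weights, the 0 to 1 confidence values and the pick_command function are invented for illustration and are not the Speech API’s own weighting mechanism. The minimum-score cut-off is an extra safeguard added in this sketch, not something I described trying above.

```python
from typing import List, Optional, Tuple

# Hypothetical weights: 'bad' is weighted high, 'shut down' is weighted low.
WEIGHTS = {
    "next": 1.0,
    "previous": 1.0,
    "notepad": 1.0,
    "shut down": 0.7,
    "bad": 1.5,
}

# Assumed cut-off: anything scoring below this is treated as noise.
MIN_SCORE = 0.6


def pick_command(hypotheses: List[Tuple[str, float]]) -> Optional[str]:
    """Choose a command from (phrase, confidence) guesses, or None for noise.

    The confidence values are assumed to be in the range 0..1.
    """
    best_phrase, best_score = None, 0.0
    for phrase, confidence in hypotheses:
        score = confidence * WEIGHTS.get(phrase, 0.0)  # unknown phrases score 0
        if score > best_score:
            best_phrase, best_score = phrase, score
    if best_phrase is None or best_phrase == "bad" or best_score < MIN_SCORE:
        return None  # diverted to 'bad' or too weak: do nothing
    return best_phrase
```

With these made-up numbers, a shaky ‘shut down’ guess at 0.7 confidence scores 0.49 and gets dropped, while a clear one at 0.95 still goes through. Whether the real engine reports numbers that separate background noise this cleanly is exactly what the trial above calls into question.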
I figure I’ll train the speech engine in a quiet environment a few more times (I have already spent an hour training it). Only after that will I train it in a louder environment, so it at least has a happier baseline where it should be able to understand my voice as opposed to other voices it didn’t train with.
But this all led me to one conclusion.
I needed to work on the User Interface.
Graphical User Interface Progress:
My attention drifted from speech – to ask the question:
What in Sam’s hell am I going to do about the user interface?
Now my requirements for the ‘user interface’ are pretty complex:
1) It MUST support 3D rendering.
Ayup. That’s it for my requirements. Easy smeasy right?
I’ve evaluated literally a hundred different vendors and suppliers for working with 3D graphics. Those include Blender, CryEngine, Unreal, Torque, and Maya. I could go on but I won’t.
I have even evaluated games which I could ‘mod’ to achieve what I wanted to – Grand Theft Auto for instance – is EASILY ONE of the best engines out there which works on a Windows XP system.
Oh yeah. That’s part of my other personal requirements:
1) Must work under Windows XP, as I do intend on buying the ‘no longer supported’ source code for Windows XP from Microsoft to make this into an Operating System one of these days.
2) It MUST be fast with only 1 GB of RAM.
3) I MUST be able to program its usage through Visual Basic 6.0.
Why? Cuz I like it. I told you that already. Get over it.
Now the problem with each of these approaches is – EACH one of them presents a massive learning curve.
And EACH one is different from the others.
And since using GTA’s engine is a hack, despite its purdiness, I eliminated that one pretty quickly. But that’s not to say I am not ‘learning’ from it.
As the real world is modeled after it. Err. Did I just say that? Oh that would be silly and wouldn’t make logical sense, would it?
No. I meant the GTA engine is modeled MUCH after the real world 😉 Yeah, that’s it.
So I found some pretty hairy Visual Basic 6.0 code which actually had ‘primitives’ (spheres/cubes/cones) drawn with OpenGL.
So what I have been doing is converting that into easy to use 3D objects which I can texture and move rapidly.
Here’s my sphere object with an earth based texture placed over it:
And to demonstrate the rapid reuse and recreation of the sphere object – I went from 1 sphere to 4 in about 30 seconds, here:
And the really cool thing is – these are all 3D animated, actual rotating spheres. I placed a video on YouTube for your enjoyment:
My goal with all of this is to create a ‘virtual reality’ programming language and environment which leverages hand gestures, body movement, and voice commands in a virtual environment to ‘create’ and literally play in these virtual worlds.
“Holodeck: I would like a tree here that is this tall”
To which I would be pointing at a spot on the ground in front of me, and my hand – being 2 feet above the ground – would elicit the command from the holodeck to create a virtual tree right there, scaled to the height of my finger.
“No. Try Spruce.”
And the holodeck would then ‘rotate’ through its three-dimensional models, finding a spruce tree, and then the 3D image would change to a spruce tree.
“OK Holodeck, age the tree about 100 years”
To make it look like the tree hadn’t just been planted. Maybe I was making a simulation of an older and more established city.
“Ok. Now put similar spruce trees here, and here, and here, about the same age”
The holodeck would then respond by placing trees in exactly the locations I just pointed to.
“Ok. Now scale my size to 10% of what it is right now.”
At which point I would be at eye level with what now look like modern ‘large’ sized spruce trees.
I would walk a little and point at a location:
“An old west house here.”
All this to ‘rapidly’ build scenes for movies, TV shows, interactive adventure, or designs for starships, or houses that haven’t been built – pretty much anything you can dream of.
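Underneath the ‘this tall’ and ‘scale my size to 10%’ commands is some very plain arithmetic. Here is a small Python sketch of it (not VB6, and not code from the project). PlacedModel, place_at_height and apparent_height are made-up names, and it assumes the model’s original height and the hand height are in the same units.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class PlacedModel:
    name: str
    position: Tuple[float, float, float]  # the pointed-at spot on the ground
    scale: float                          # uniform scale applied to the model


def place_at_height(name: str, model_height: float,
                    ground_point: Tuple[float, float, float],
                    hand_height: float) -> PlacedModel:
    """Scale a model so its top reaches the height of the hand."""
    if model_height <= 0 or hand_height <= 0:
        raise ValueError("heights must be positive")
    return PlacedModel(name, ground_point, hand_height / model_height)


def apparent_height(world_height: float, viewer_scale: float) -> float:
    """How tall an object feels to a viewer shrunk to viewer_scale (0.10 = 10%)."""
    if viewer_scale <= 0:
        raise ValueError("viewer_scale must be positive")
    return world_height / viewer_scale
```

So a tree scaled to a hand held 2 feet off the ground feels like a 20 foot tree once the viewer is shrunk to 10%.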
The cool thing about Visual Basic 6.0 is that it’s about as close to the Windows APIs as you can get without going to a less forgiving language such as C++. And since most of the optimized OpenGL work is done through API calls, Visual Basic makes for a perfectly suited language for it.
Here’s the simple code which initializes OpenGL on any window. The goal is to build the message pump around an object-oriented structure where multiple GL windows have 3D going on simultaneously, all initialized the same way.
My ‘base design’ will preload all the textures and objects – and pre-position them – and the physical hierarchy for all windows in the system – and then ‘swap those in’ as needed – immediately – with no latency and no ‘start time’ resulting in lag.
Once they are loaded, the simulation becomes ‘dynamic’.
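The preload-then-swap idea can be sketched like this. It is Python for illustration only, not the VB6/OpenGL code. AssetCache and its loader callback are hypothetical, and real texture loading would go through whatever the GL wrapper provides.

```python
from typing import Callable, Dict, Iterable


class AssetCache:
    """Load every asset up front so that later lookups never touch the disk."""

    def __init__(self, loader: Callable[[str], bytes]) -> None:
        self._loader = loader              # hypothetical: reads one asset by name
        self._assets: Dict[str, bytes] = {}

    def preload(self, names: Iterable[str]) -> None:
        """Do all the slow loading once, before the simulation starts."""
        for name in names:
            if name not in self._assets:
                self._assets[name] = self._loader(name)

    def get(self, name: str) -> bytes:
        """Swap an asset in immediately; raises KeyError if it was not preloaded."""
        return self._assets[name]
```

Having get raise an error instead of quietly loading on demand is deliberate in this sketch: a missing asset shows up as a bug during development rather than as a surprise load pause in the middle of a scene.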
Why design like this? Nothing annoys me more in a video game than the dreaded load screen.
Now this will be even worse in a virtual environment. For instance, let’s say you walk down the block and hit a ‘loading zone’: having to actually wait ‘breaks the illusion’ of the virtual environment.
That is NOT what I want. I want the VR environment to be seamless.
And where transitions between locations require different assets, I will have ‘transition code’, like an airplane ride or a car ride, where the scenery stays the same but it feels like you’re still making progress between scenes.
Here’s the object hierarchy I have so far for the VR/GL objects:
I leveraged code from NeHe (here), who has done some phenomenal work with GL, but as far as readable and reusable code goes, it is absolute crap.
Sorry, NeHe. You are clearly gifted in some areas and not in others.
What I will be doing is cleaning up my memory problems. I figure they’re associated with my port of the device contexts to a standard VB form rather than the way he had it (CreateWindow). Using a form makes it MUCH easier for me to work with and reuse the code and avoid crashing. Sure, it may not be as fast, but I can then tie the pump to a specific window, carry contextual information, and do cleanup work with an object rather than a set of procedures.
In any case.
I am leaving feeling marginally productive today.
I have wicked memory leaks, which I will resolve in time. That’s a problem with working in Visual Basic and APIs: if you screw something up, it’s REALLY not nice.
On that note:
A note to Mr Bill Gates:
I’d like to purchase Windows XP. And create my own operating system – Windows 3D based on XP’s core.
But more than that.
I’d like to offer you a partnership if you have any desire to code again – but more than that – if you would like to act as a business manager for what I’m doing. I could use help. And resources.
You know what I am doing now and why I have been annoying you.
And because I really don’t like managing the business end of things, I would like to propose to you a joint business venture.
I’m a sightseer, adventurer, and explorer, which led me to what it took mentally to piece this together – and the business end simply presents too many hassles that I do not wish to deal with.
You’ve changed my life with what you have done.
It’s my turn.