Design Overview


This post explains the motivation for the terraAI project. Below we summarize the overall design principles.

To rectify the problems of the Dark KBs mentioned above, we seek to create an open-source, publicly accessible online platform where everyone can join in to build the KB and its associated tools together. We call this platform the terraAI project.

The following are some high-level design principles of terraAI:

  1. Open source. The code is open source, and we aim to use open-source libraries whenever possible. The KBs created are also open, unless they are of a personal nature.
  2. Knowledge-driven. Tools in this system will be driven by well-defined knowledge. Knowledge is acquired through Machine Learning methods, and implicit knowledge can be inferred.
  3. Semi-automated KB. We aim to avoid relying entirely on highly skilled AI professionals to create the KB, as many previous endeavors did. This means that the system will create and validate its KB with the aid of Machine Learning technologies whenever possible. Such a KB should of course also be validated by AI professionals (which is much easier than having them do everything), or by the crowd (see below) when applicable.
  4. Online community - we aim to create a connected community of users and developers over the Internet.
  5. Crowd-driven. We want this to be an online platform that is as accessible to the general public as possible. Why? We believe that it is only with the wide participation of the general public (i.e., not just programmers or AI researchers) that we have any hope of creating a KB with the immense complexity and richness of our world, and of competing with the well-funded Dark KBs. This goal is of course quite difficult (see the ease of use item below), but we believe that it is worth the extra effort. Separately, we also aim to offer useful personal tools driven by the terraAI KB, such as an intelligent personal assistant and other apps, in order to reward users for their effort and to smooth the contribution process.
    It is worth no
  6. Ease of use - in order to allow contributions from the general public, the user interface has to be exceedingly easy to understand and use. We expect that the UI will need to support some degree of natural language understanding, advanced visualization, etc.
  7. Serve as an online platform for developers. We want to make the platform easy for the developer community to work with, even for those with little knowledge of the relevant Artificial Intelligence technologies.
  8. Distributed client-side processing. We aim to move as much processing as possible to the client side, which should make the system much more scalable. One part of this is figuring out how to achieve Distributed Machine Learning.
  9. Access from anywhere. This means supporting mobile devices and all major browsers, following responsive UI design principles, etc.
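
To make principles 3 and 5 a little more concrete, here is a minimal sketch of how a fact proposed by a machine-learning component might wait for crowd validation before entering the KB. Everything in it is an assumption made for illustration: the `CandidateFact` class, the triple-style fact, and the vote thresholds are hypothetical and do not describe existing terraAI code.

```python
from dataclasses import dataclass, field


@dataclass
class CandidateFact:
    """A fact proposed by a machine-learning component (hypothetical structure)."""
    subject: str
    relation: str
    obj: str
    model_confidence: float                      # score from the proposing model, 0..1
    votes: list = field(default_factory=list)    # True = confirm, False = reject

    def add_vote(self, confirm: bool) -> None:
        self.votes.append(confirm)

    def status(self, min_votes: int = 3, min_agreement: float = 0.8) -> str:
        """Return 'accepted', 'rejected', or 'pending' based on crowd votes.

        The thresholds are illustrative assumptions, not tuned values.
        """
        if len(self.votes) < min_votes:
            return "pending"
        agreement = sum(self.votes) / len(self.votes)
        if agreement >= min_agreement:
            return "accepted"
        if agreement <= 1.0 - min_agreement:
            return "rejected"
        return "pending"  # contested: needs more votes or an expert review


def review_queue(facts):
    """List pending facts, least confident model proposals first."""
    pending = [f for f in facts if f.status() == "pending"]
    return sorted(pending, key=lambda f: f.model_confidence)
```

With this shape, AI professionals and the crowd only confirm or reject proposals, which is the "much easier than doing everything" role described in principle 3.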

Further design details about the terraAI project will be discussed in subsequent posts.

Guiding principles
  1. Open source
  2. Knowledge-driven
  3. Semi-automated KB
  4. Online community
  5. Crowd-driven
  6. Ease of use
  7. Developer-friendly
  8. Distributed
  9. Privacy


If we are not careful about the scope of this project, we could bite off a piece so large that we never come up with anything useful. We will therefore divide the project into phases, starting with something simple and practical, while trying to keep it forward-compatible with later phases.

  1. Phase 1
    1. Target text-based information only.
    2. Piggy-back on top of popular search engines.
    3. Define a knowledge representation suitable for...
  2. Phase 2
    1. Piggy-back on top of popular image search engines, as well as reverse image search engines, for adding tags to whole images (i.e., without image segmentation capability).
  3. Phase 3: to be defined

  1. Target simple tools first, even if they don't seem all that AI-ish. Put another way: get a good-sized crowd to use it, and the AI will follow.
  2. Try to piggy-back on existing Internet technologies, such as text-based search engines, text-based image search engines, reverse image search engines, etc.
  3. Start with text-based machine-learning algorithms, before moving on to images and beyond.
  4. Aim to perfect the crowd socialization environment first, so that we can use the power of the crowd to push the project forward.
  5. Support the "live onion" architecture (yeah, I just invented the term. Let me know if you have a better one), so that we can empower the technical community to help us early on. What this means is that we want to make it easy for a developer to overlay his version of a certain TAI code module (e.g., a new machine learning algorithm, a new visualization method, etc.) on the live TAI system so that he can try it out easily. He can also share it with others so that they can try it out online. Once the TAI administrator accepts and publishes it, all users have the option of using this developer's new module or staying with the standard official version. Other developers can also build additional enhancements on top of this developer's module (hence the onion moniker).
  6. We will go deep instead of broad. For example, this system will have the capability to converse with the user in natural languages (e.g., English).
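
The "live onion" idea in item 5 can be sketched as a small registry of layered module versions. This is only one possible reading of it: the `OnionRegistry` class, its method names, and the "tokenizer" module are all hypothetical, and sharing an unpublished layer with selected users is left out for brevity.

```python
class OnionRegistry:
    """Tracks layered versions ("onion layers") of named code modules."""

    def __init__(self):
        self._impl = {}          # layer id -> callable implementation
        self._name = {}          # layer id -> module name
        self._author = {}        # layer id -> author
        self._published = set()  # layer ids accepted by the administrator
        self._official = {}      # module name -> layer id of the official version
        self._choice = {}        # (user, module name) -> layer id the user opted into
        self._next_id = 0

    def _add(self, name, impl, author):
        layer_id = self._next_id
        self._next_id += 1
        self._impl[layer_id] = impl
        self._name[layer_id] = name
        self._author[layer_id] = author
        return layer_id

    def register_official(self, name, impl):
        layer_id = self._add(name, impl, "admin")
        self._published.add(layer_id)
        self._official[name] = layer_id
        return layer_id

    def overlay(self, author, parent_id, wrap):
        """Add a layer on top of `parent_id`.

        `wrap` receives the parent implementation and returns the new one,
        so layers can be stacked on layers.
        """
        impl = wrap(self._impl[parent_id])
        return self._add(self._name[parent_id], impl, author)

    def publish(self, layer_id):
        self._published.add(layer_id)

    def choose(self, user, layer_id):
        # Unpublished layers can only be tried by their own author.
        if layer_id not in self._published and self._author[layer_id] != user:
            raise PermissionError("layer is not published yet")
        self._choice[(user, self._name[layer_id])] = layer_id

    def resolve(self, user, name):
        """Return the implementation this user runs: a chosen layer, else the official one."""
        layer_id = self._choice.get((user, name), self._official[name])
        return self._impl[layer_id]


registry = OnionRegistry()
base = registry.register_official("tokenizer", str.split)
lower = registry.overlay("dev_a", base, lambda parent: lambda text: parent(text.lower()))
registry.choose("dev_a", lower)  # the author tries the new layer on the live system
```

Here `dev_a` sees the lowercasing tokenizer while every other user keeps the official one until the layer is published and they opt in.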

Distributed Machine Learning

As mentioned earlier, we have made a number of design decisions, namely:

  1. Machine learning plays a pivotal role in this system, where it is used for acquiring the requisite knowledge in either a supervised or an unsupervised manner. Toward this goal we will use artificial neural networks (ANNs).
  2. Offload computation to the multitude of client devices as much as possible (as mentioned above).

Given these design decisions, we need to figure out how to perform a highly distributed and loosely coupled kind of machine learning using ANNs. One way to achieve this is as follows:

  1. Borrow ideas from the paper "Decoupled Neural Interfaces using Synthetic Gradients": we can use DNI (Decoupled Neural Interfaces) and message-based communication to divide a large ANN system into many smaller sub-modules that can be updated independently and asynchronously.
  2. Follow the loosely-coupled design principle and find ways to modularize the system. Note that the DNI paper mentioned above is concerned with decoupling individual layers within an ANN, while here we aim to do the same mainly at a much coarser granularity.
  3. (How to perform incremental learning?)
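
As a rough illustration of items 1 and 2, the toy sketch below splits a two-weight "network" into two modules that talk only through message queues. It is loosely inspired by the synthetic-gradient idea and heavily simplified: the class names, the scalar weights, the linear gradient predictor, and the learning rates are all assumptions chosen for illustration. Both modules run in one loop here, whereas in the envisioned system they would live on different clients.

```python
from collections import deque


class ModuleA:
    """First stage: h = w1 * x. Updates itself from a *predicted* gradient."""

    def __init__(self, w1=1.0, lr=0.1, lr_synth=0.1):
        self.w1 = w1
        self.coef = 0.0          # synthetic-gradient model: predicts dL/dh as coef * h
        self.lr = lr
        self.lr_synth = lr_synth
        self.inbox = deque()     # feedback messages (h, true_grad) from ModuleB

    def process_feedback(self):
        # Regress the gradient predictor toward true gradients that have arrived.
        while self.inbox:
            h, true_grad = self.inbox.popleft()
            self.coef -= self.lr_synth * (self.coef * h - true_grad) * h

    def step(self, x, y, outbox):
        self.process_feedback()
        h = self.w1 * x
        synthetic_grad = self.coef * h            # estimate of dL/dh, no waiting on ModuleB
        self.w1 -= self.lr * synthetic_grad * x   # local update using the estimate
        outbox.append((h, y))                     # message to ModuleB


class ModuleB:
    """Second stage: y_hat = w2 * h, trained on the true loss 0.5 * (y_hat - y)**2."""

    def __init__(self, w2=1.0, lr=0.1):
        self.w2 = w2
        self.lr = lr
        self.inbox = deque()     # activation messages (h, y) from ModuleA

    def step(self, outbox):
        while self.inbox:
            h, y = self.inbox.popleft()
            err = self.w2 * h - y
            true_grad = err * self.w2             # dL/dh, sent back to ModuleA
            self.w2 -= self.lr * err * h
            outbox.append((h, true_grad))


def loss(mod_a, mod_b, x, y):
    return 0.5 * (mod_b.w2 * mod_a.w1 * x - y) ** 2


def train(steps=200):
    mod_a, mod_b = ModuleA(), ModuleB()
    x, y = 1.0, 2.0                               # single toy sample with target y = 2 * x
    initial = loss(mod_a, mod_b, x, y)
    for _ in range(steps):
        mod_a.step(x, y, mod_b.inbox)             # A never blocks on B's backward pass
        mod_b.step(mod_a.inbox)
    return initial, loss(mod_a, mod_b, x, y)
```

The point of the sketch is the coupling, not the model: the only things crossing the module boundary are small messages, so each side can be updated on its own schedule.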

Further details about this part will be discussed in a separate post.

System components


Advanced Use Cases

Listed below are some advanced use cases. They are advanced in the sense that we don't expect to be able to solve them in the near term. Nonetheless, they are useful as guidepost cases for validating our design and architecture from time to time, to ensure that we have not strayed too far off course.

You might also call these advanced use cases the SHC, or the Sherlock Holmes cases, since they involve the use of rich background knowledge along with a series of inexact inferences in order to converge on a likely explanation.

Advanced Use Case #1: find geo-location of an image
Problem: given the following photo as well as the descriptive text "I watch this whale from roadside with some 20 other people in Oregon," find the exact location where this photo was taken, using nothing but a standard web browser.

This may seem like an extremely difficult problem, but I was able to solve it in about 15 minutes as follows:

  1. Infer the following from the given data:
    1. Since whales live in the ocean, this must be somewhere along the Oregon coastline.
    2. Sandy beaches can be eliminated, since whales cannot swim in shallow water.
    3. The location is perhaps a road-side parking area (or some open space easily reachable by cars) large enough to hold 20+ people.
  2. Show the Oregon coastline in Google Maps and switch to Satellite mode.
  3. Scroll down the map along the coastline and find locations that match the above criteria. It turns out that the Boiler Bay State Scenic Viewpoint is a good match.
  4. Go into Google Street View and Google Image Search for that area, and confirm that the rock formation at that location is a good match for what's in the photo.

So what does it take for TAI to achieve the same? Lots.

  1. Have knowledge that whales are large animals that live in the ocean.
  2. Ability to recognize a whale in a photo.
  3. Have knowledge that whales cannot swim in the shallow water off a beach.
  4. Have knowledge of what beaches look like in a satellite image, i.e., with a lot of white wave caps.
  5. Ability to distinguish a sandy beach from a rocky shore in a satellite image.
  6. Ability to operate Google Maps, and use it to find candidate locations that match the above criteria.
  7. Ability to verify that views of the candidate location (e.g., from Google Street View, Google Image Search, etc.) roughly match the photo.
  8. And so on, and so forth.
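
To give a feel for the knowledge items in this list, here is a minimal sketch that stores background knowledge as (subject, relation, object) triples and derives search constraints from them. The triple format, the relation names, and the `search_constraints` function are assumptions made for this example. They are not a proposal for the actual terraAI knowledge representation.

```python
# Background knowledge as (subject, relation, object) triples. All entries are illustrative.
FACTS = {
    ("whale", "is_a", "large animal"),
    ("whale", "lives_in", "ocean"),
    ("whale", "cannot_swim_in", "shallow water"),
    ("sandy beach", "has", "shallow water"),
    ("ocean", "borders", "coastline"),
}


def search_constraints(seen_object, facts):
    """Derive location constraints from background knowledge about `seen_object`."""
    constraints = []
    for subj, rel, obj in sorted(facts):
        if subj != seen_object:
            continue
        if rel == "lives_in":
            # The photo location must be next to wherever the object lives.
            for s2, r2, o2 in sorted(facts):
                if s2 == obj and r2 == "borders":
                    constraints.append(("require", o2))
        elif rel == "cannot_swim_in":
            # Rule out kinds of places that have what the object cannot swim in.
            for s2, r2, o2 in sorted(facts):
                if r2 == "has" and o2 == obj:
                    constraints.append(("exclude", s2))
    return constraints
```

Calling `search_constraints("whale", FACTS)` yields a "require coastline" and an "exclude sandy beach" constraint, mirroring the first two inferences in the manual solution above.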

So what does the use case above tell us? Here are some lessons that I can think of:

  1. Having the capability to localize and recognize objects in an image, and to associate them with related text and background knowledge, is extremely powerful.
  2. Having a rich store of background knowledge and being able to conduct inferences, even if inexact, can be quite helpful. Due to the inexact nature of general knowledge about our world, it is more useful to consider such inferences NOT as a way of deducing truth statements, but rather as a way to narrow the scope of search.
  3. We humans use many online tools to magnify our information processing power. It would be nice if TAI could do the same (perhaps with advanced image recognition and machine learning algorithms), someday.
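
The second point, using inexact inferences to narrow a search rather than to prove anything, can be sketched as soft scoring of candidate locations. The `Candidate` class, the feature names, the weights, and every probability below are made-up assumptions for illustration. Only the Boiler Bay name comes from the use case above, and the two other candidates are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    features: dict   # feature name -> estimated probability (0..1) that it holds


# Each inexact inference becomes a weighted preference rather than a hard truth:
# (feature, wanted value, weight). The numbers are illustrative only.
PREFERENCES = [
    ("on_coast", 1.0, 3.0),
    ("sandy_beach", 0.0, 2.0),
    ("roadside_parking", 1.0, 1.0),
]


def score(candidate, preferences=PREFERENCES):
    """Higher is better; a feature with no estimate counts as 0.5 (neutral)."""
    total = 0.0
    for feature, wanted, weight in preferences:
        p = candidate.features.get(feature, 0.5)
        total += weight * (1.0 - abs(wanted - p))
    return total


def narrow(candidates, keep=2):
    """Keep only the best-scoring candidates for closer (more expensive) checks."""
    return sorted(candidates, key=score, reverse=True)[:keep]


candidates = [
    Candidate("hypothetical sandy beach", {"on_coast": 1.0, "sandy_beach": 0.9}),
    Candidate("hypothetical inland rest area",
              {"on_coast": 0.0, "sandy_beach": 0.0, "roadside_parking": 1.0}),
    Candidate("Boiler Bay State Scenic Viewpoint",
              {"on_coast": 1.0, "sandy_beach": 0.1, "roadside_parking": 0.9}),
]
shortlist = narrow(candidates)
```

No single preference has to be right for this to help. The shortlist only decides which locations deserve the slower verification step, such as comparing street-level views against the photo.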

Do you have any further insight? Let me know by entering your comments below!



