What if we had oracles for common machine learning problems?

Rough working notes, musing out loud.

Much effort in machine learning and AI research is focused on a few
broad classes of problem. Three examples of such classes are:

  • Classifiers, which do things like classify images according to their
    category, generalizing from their training data so they can classify
    previously unseen data in the wild;

  • Generative models, which are exposed to data from some distribution
    (say, images of houses), and then build a new model which can
    generate images of houses that weren't in the training set. In some
    very rough sense, such generative models are developing a theory of
    the underlying distribution, and then using that theory to
    generalize so they can produce new samples from the distribution;

  • Reinforcement learning, where an agent uses actions to explore some
    environment, and tries to learn a control policy to maximize
    expected reward.

These are old problem classes, going back to the 1970s or earlier, and
each has seen tens of thousands of papers. Each of these problem
classes is really beautiful: they’re hard, but not so hard it’s
impossible to make progress; they’re precise enough that it’s possible
to say clearly when progress is being made; they’re useful, and seem
genuinely related to essential parts of the problem of AI.

I occasionally wonder, though, what’s the end game for these problem
classes? For instance, what will it mean if, in some future world,
we’re able to solve the classifier problem perfectly? How much would
that help us achieve the goal of general artificial intelligence? What
else would it let us achieve?

In other words, what happens if you skip over (say) the next few
decades of progress in classifiers, or generative models, or
reinforcement learning? Suppose they become things you can just
routinely do essentially perfectly, perhaps even part of some
standard library,
much as (say) sorting routines or random number generation can be
regarded as largely solved problems today. What other problems then
become either soluble, or at least tractable, which are intractable
today?
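
To make the thought experiment a little more concrete, here is a
minimal sketch of what such “standard library” oracles might look like
as interfaces. The names and signatures are hypothetical, invented
purely for illustration – nothing here corresponds to a real library.

```python
from typing import Any, List, Protocol, Sequence


class ClassifierOracle(Protocol):
    """Hypothetical solved classifier: fit on labelled examples, then
    label previously unseen inputs essentially perfectly."""

    def fit(self, inputs: Sequence[Any], labels: Sequence[str]) -> None: ...
    def predict(self, inputs: Sequence[Any]) -> List[str]: ...


class GenerativeOracle(Protocol):
    """Hypothetical solved generative model: fit on samples from some
    distribution, then draw convincing new samples from it."""

    def fit(self, samples: Sequence[Any]) -> None: ...
    def sample(self, n: int) -> List[Any]: ...


class Policy(Protocol):
    def act(self, observation: Any) -> Any: ...


class RLOracle(Protocol):
    """Hypothetical solved reinforcement learner: given an environment,
    return a policy that maximizes expected reward."""

    def train(self, environment: Any) -> Policy: ...
```

The point of writing them as opaque interfaces is that the question
above is precisely about what you could build on top of the interface,
without ever looking inside.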

Perfect solutions don’t obviously help, even with closely adjacent
problems: One obvious point is that you can make a great deal of
progress on one of these problems and it doesn’t necessarily help you
all that much even with problems which seem closely adjacent.

For instance, suppose you can classify images perfectly.

That doesn’t necessarily mean that you can solve the image
segmentation problem – identifying the different objects in some
general image.

And even if you can solve the image segmentation problem for static
images, that doesn’t mean you can solve it for video. I’ve watched
(static) image segmentation algorithms run on video, and they can be
remarkably unstable, with objects jumping in and out as we move from
frame to frame. In other words, the identity of an object across
frames is not obviously easy to track, even given perfect
classifiers. For instance, something like one object obscuring another
can cause considerable problems in making inferences about the
identity of the objects in a scene.
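
As a toy illustration of why perfect per-frame segmentation still
leaves the identity problem open, here is a sketch of the extra
bookkeeping a video pipeline needs. The `segment` argument stands in
for a hypothetical per-frame oracle returning bounding boxes; the
greedy IoU matching is a deliberately naive association rule, and it
fails in exactly the situations just mentioned, such as one object
briefly occluding another.

```python
def iou(a, b):
    """Intersection-over-union of two axis-aligned boxes (x1, y1, x2, y2)."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union else 0.0


def track(frames, segment, threshold=0.5):
    """Assign persistent ids to per-frame detections via greedy IoU matching.

    `segment` is the hypothetical per-frame oracle: frame -> list of boxes.
    Even if every frame is segmented perfectly, occlusion or fast motion
    breaks the IoU link and an object gets a fresh id -- the "jumping"
    instability described above.
    """
    next_id, prev, history = 0, {}, []   # prev maps object id -> last box
    for frame in frames:
        current = {}
        for box in segment(frame):
            # match this box to the best-overlapping box from the last frame
            best = max(prev.items(), key=lambda kv: iou(kv[1], box), default=None)
            if best is not None and iou(best[1], box) >= threshold:
                obj_id = best[0]
                del prev[obj_id]          # each old box can match only once
            else:
                obj_id, next_id = next_id, next_id + 1   # new identity
            current[obj_id] = box
        history.append(current)
        prev = dict(current)
    return history
```

Whatever replaces that naive matching rule – motion models, appearance
features, re-identification – is exactly the machinery a perfect
static classifier does not supply.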

AI-complete problems: The problem classes described above are in
some sense very natural problems, the kind that would occur to anyone
who thought about things like how humans recognize images, how they
create new images, or how they play games. But you can ask a very
different question, a much more top-down question, which is whether
there is some class of problem which, if you could solve that, would
enable you to build a genuinely artificially intelligent machine as a
byproduct?

This notion is called AI-completeness. According to Wikipedia, the
term was coined by the researcher Fanya Montalvo in the 1980s.

It’s interesting to read speculation about what problems would be
AI-complete.

The classic Turing test may be viewed as an assertion that the problem
of passing the Turing test – routinely winning the imitation
game against competent humans – is AI-complete.

Another example which is sometimes given is the problem of machine
translation. At first this seems ridiculous: the best machine
translation services can now do a serviceable job translating many
texts, and yet we’re very unlikely to be close to general artificial
intelligence.

Of course, those services don’t yet do excellent translations. And
some of the problems they face in order to do truly superb
translations are very interesting.

For instance: very good translations of a novel or a poem may require
the ability to track allusions, word-play, contrasts in mood,
contrasts in character, and so on, across long stretches of text. It
can require an understanding of quite a bit about the reader’s state
of mind, and perhaps even very complex pieces of folk psychology
– how the author thought the reader would think about the impact
one character’s changing relationship with a second character would
have on a third character. That sounds very complicated, but is
utterly routine in fiction. Certainly, producing excellent
translations is an extremely difficult problem which requires enormous
amounts of understanding.

That said, I’m not sure machine translation is AI-complete. Even if a
machine translation program did all those things, it’s not obvious you
can take what is learned and use it to do other things. This is
evident for certain tasks – learning to do machine translation,
no matter how well, probably will only help a tiny bit with (say)
robotics or machine vision. But I think it may be true even for
problems which seem much more in-domain. For example, suppose your
machine translation system can prepare first-rate translations of
difficult math books. It might be argued that there is some sense in
which it truly understands the mathematics. But even if that’s the
case – and it’s not obvious – that understanding may not be
accessible in other ways.

To illustrate this point, let’s grant, for the sake of argument, that
the putative perfect math-translation system really does understand
mathematics deeply. Unfortunately, that doesn’t imply we can make use
of that understanding to do other things. It doesn’t mean we can ask
questions of the system. It doesn’t mean the system can prove
theorems. And it doesn’t mean the system can conjecture new theorems,
conjure up new definitions, and so on. Much of the relevant
understanding of mathematics may well be available inside the
system. But it doesn’t know how to utilize it. Now, it’s potentially
the case that we can use some kind of transfer learning to make it
significantly easier to solve those other problems. But that’d need
to be established in any given context.
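
One way to make “that’d need to be established” concrete is as a
probing experiment: freeze the hypothetical translation system, expose
its internal representations, and check whether a trivial probe built
on them does better than chance at some other task. The
`translator.encode` method below is an assumption, not a real API; the
sketch only shows the shape of such a test.

```python
import numpy as np


def probe_transfer(translator, texts, task_labels):
    """Check whether a (hypothetical) translation model's internal
    representations carry understanding usable for a different task.

    `translator.encode(text)` is assumed to return a fixed-length vector
    of internal activations; the question is whether a trivial probe built
    on those frozen vectors beats chance on the new task.  (A real
    experiment would score held-out examples, not the ones used to build it.)
    """
    features = np.stack([translator.encode(t) for t in texts])  # frozen features
    labels = np.asarray(task_labels)

    # trivial probe: assign each text to the nearest class centroid
    centroids = {c: features[labels == c].mean(axis=0) for c in set(task_labels)}
    predictions = [
        min(centroids, key=lambda c: np.linalg.norm(f - centroids[c]))
        for f in features
    ]
    return float(np.mean(np.asarray(predictions) == labels))
```

If probes like this do no better than chance, then whatever
understanding is inside the system isn’t accessible through that
interface – which is exactly the worry above.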

For these reasons, I’m skeptical that narrowly-scoped AI-complete
problems exist.

Summary points

  • A useful question: given the black-box ability to train a perfect
    classifier (or generative model or reinforcement learning system or
    [etc]), what other abilities would that give us? I am, I must
    admit, disappointed in my ability to give interesting answers to
    this question. Worth thinking more about.

  • The Turing Test as an assertion that the Imitation Game is
    AI-complete.

  • No narrowly-scoped problem can be AI-complete. The trouble is that
    if it’s narrowly scoped then while the system may in some sense have
    a deep internal understanding, that doesn’t mean that understanding
    can be used to solve other problems, even in closely-adjacent areas.
    Put another way: there is still a transfer learning problem, and
    it’s not at all obvious that problem will be easy. Put still
    another way: interface matters.
