Table of contents
- Scalability Challenge
- Previous Approaches
- Our Approach
- Remark on Plan Optimality
Scalability Challenge
Planning problems are known to scale poorly with conventional approaches: they require extreme amounts of RAM and compute to produce plans over real-world datasets. This document briefly discusses the HyperC approach to the scalability challenge of the planning database.
- Planning delivery of 10 items with 10 steps takes 10 MB of RAM and 30 seconds
- Planning 1000 items with 20 steps needs 1,000,000 MB of RAM and 3e10 seconds
In other words, the state space grows combinatorially with the amount of data that needs to be processed.
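To see where that blow-up comes from, consider a rough back-of-the-envelope sketch (the figures are illustrative, not HyperC measurements): if each of the d plan steps chooses among roughly n applicable actions, blind search explores on the order of n^d states.

```python
# Rough state-space estimate for blind search: if each of the d plan steps
# chooses among n applicable actions, the search tree has about n**d nodes.
# The numbers below are illustrative only.
def search_space(n_actions: int, depth: int) -> int:
    return n_actions ** depth

small = search_space(10, 10)    # ~10 items, 10-step plan
large = search_space(1000, 20)  # ~1000 items, 20-step plan
print(f"blow-up: {large // small:.1e}x")
```

Even with generous constant factors, no amount of hardware closes a gap of that shape; only shrinking the state space does.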
Previous Approaches
Other than applying a hardcoded domain-specific method to a particular sub-problem, several universal approaches are in use for solving planning problems:
Multi-agent systems. The idea is to define the problem in terms of independent “agents” that communicate and make decisions on their own. Multi-agent tasks are very easy to scale across many CPUs. The issue is that very few real-world processes behave like independent agents, and even when agents are present, their “actions” are highly interdependent. Additionally, multi-agent programming requires more brainpower than free-form modeling and is not popular.
GPU-based methods. Various GPU-based methods are under development. The challenge with applying GPUs is their specific architecture, optimized for thousands of threads that must follow the same execution flow branch. Whenever the threads fall out of sync, the vectorized execution breaks down and performance degrades to single-threaded levels. A typical planning problem is about exploring many different branches, so advanced trickery is required to map such a task onto a GPU. This makes crafting an efficient GPU-bound planning task extra challenging, and it could be intractable for humans without a universal task translator, which does not exist yet.
LP and SAT solvers. An approach based on scalable LP and SAT solvers is the most widespread in the Operations Research community. It also provides the best-performing solutions without any additional compute-heavy training or tuning step. The downside is that a specific skillset and experience are required to convert a real-world problem into an LP or SAT formalism. Additionally, any seemingly insignificant change to the problem definition may require a complete rewrite of the solution, multiplying development time and cost.
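As a toy illustration of what the SAT route demands, here is a hand-built encoding of a three-fact delivery question, solved by brute force (the variable meanings and clauses are invented for this sketch; real encodings also carry time steps and frame axioms, which is where the skill requirement comes from):

```python
from itertools import product

# Toy SAT encoding (invented for illustration): x0 = "truck loaded",
# x1 = "truck at depot", x2 = "item delivered". Each clause is a list of
# (variable_index, required_polarity) literals; a clause holds if any
# literal matches the assignment.
clauses = [
    [(2, True)],                 # goal: item delivered
    [(2, False), (0, True)],     # delivered implies truck was loaded
    [(0, False), (1, True)],     # loaded implies truck was at depot
]

def satisfiable(clauses, n_vars=3):
    # Brute-force model search over all 2**n_vars assignments.
    for assignment in product([False, True], repeat=n_vars):
        if all(any(assignment[v] == pol for v, pol in clause)
               for clause in clauses):
            return assignment
    return None

print(satisfiable(clauses))  # (True, True, True)
```

Changing the scenario (say, adding a second truck) forces new variables and new clauses throughout, which is exactly the "complete rewrite" cost described above.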
Hierarchical Task Networks (HTN). HTNs have seen the most success as a universal approach to scalable planning in various applications. The downside is that they combine the features and issues of the methods above: they require heavy developer involvement in managing and hand-tuning the problem definition, as well as experience that is hard to obtain quickly.
Our Approach
HyperC takes a novel approach to the problem that can be nicknamed “decompose and reuse”.
Decompose means that we split the problem into many smaller sub-problems. The main implementation of this principle is the “gradual grounding” approach, where the highest-level expressivity construct (the database engine query) is interpreted as a set of Python statements, then the ambiguities in the Python variables are interpreted, and so on.
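The layering idea can be sketched as a cascade of small lowering passes (a conceptual sketch only; function names and the string-based representation are invented here and are not HyperC's actual pipeline):

```python
# Conceptual sketch of "gradual grounding": each layer lowers a more
# expressive construct into a less ambiguous one, so every sub-problem
# the solver faces stays small. Names and representations are illustrative.
def lower_query_to_statements(query: str) -> list[str]:
    # Layer 1: a database-style query becomes a list of statement templates.
    return [stmt.strip() for stmt in query.split(";") if stmt.strip()]

def resolve_ambiguities(statements: list[str], bindings: dict) -> list[str]:
    # Layer 2: ambiguous variable references are replaced by concrete values.
    return [stmt.format(**bindings) for stmt in statements]

stmts = lower_query_to_statements("pick({item}); load({item}, {truck})")
grounded = resolve_ambiguities(stmts, {"item": "box1", "truck": "t7"})
print(grounded)  # ['pick(box1)', 'load(box1, t7)']
```

The point of the cascade is that ambiguity is resolved one layer at a time instead of all at once, so no single pass faces the full combinatorial space.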
Reuse means that we use the system to optimize itself. The planning problem can be expressed as a planning problem; the automatic proof can be expressed as a planning problem; and so on. We solve the planning problem and reuse the solver recursively.
The scalability challenge itself decomposes into two subtasks:
- Given a poorly defined problem in an expressive, non-deterministic formalism, optimize it for better performance
- Given a solution for a small number of items/trucks/logic steps, figure out how to solve for a larger number
The first subtask comes from the fact that it is very hard to isolate a “pure” problem from a real-world process, so we expect the original definition to contain unnecessary actions and effects and to miss “obvious” constraints. The model of the domain itself will have unnecessary detail that pollutes the heap with irrelevant or redundant facts and statements.
The second subtask is about learning to solve smaller problems and applying that knowledge to try to solve larger ones.
To stay compatible with existing planning research, HyperC restricts its internal representation to:
- STRIPS formalism
- Strictly first-order logic
- Filtering everything through a layer of a subset of PDDL
This allows us to apply the large corpus of research and algorithms that exists in the AI planning domain. For example, it has been shown that sound plans can always be found if enough states (interpretations) are grounded.
In STRIPS, we can state the scaling problem in two parts:
- Find a smaller grounding to make exhaustive search possible
- Use universal heuristics to search better than blind search
It is easy to see that the smaller state space resulting from fewer interpretations allows the problem to be solved much faster even without any additional heuristics. The universal heuristics can be taken from existing research or machine-learned on a concrete problem, on a set of problems, or universally with a domain-independent embedding.
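To make "grounding" concrete, here is a minimal STRIPS-style grounder (a didactic sketch with made-up object counts, not HyperC code): an action schema with typed parameters grounds to one concrete action per combination of objects, so the grounding size is the product of the parameter domain sizes.

```python
from itertools import product

# Minimal STRIPS-style grounding sketch: an action schema grounds to one
# concrete action per combination of objects that fill its parameters.
def ground(schema_params: dict[str, list[str]]) -> list[dict]:
    names = list(schema_params)
    return [dict(zip(names, combo))
            for combo in product(*(schema_params[p] for p in names))]

# Illustrative domains for a deliver(?item, ?truck, ?loc) schema.
objects = {"?item": [f"i{n}" for n in range(10)],
           "?truck": [f"t{n}" for n in range(5)],
           "?loc": [f"l{n}" for n in range(4)]}
print(len(ground(objects)))  # 10 * 5 * 4 = 200 ground actions
```

Every technique in the list below attacks some factor of that product, which is why reducing the grounding compounds so quickly.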
Techniques that reduce the grounding include:
- Ahead-of-time Identity Interpretation - pre-computed interpretations of variable/object identity to reduce the number of absurd interpretations
- Automatic Side-Effect (ASE) - automatic proof that part of the definition has no effect on the solution or plan and is safe to remove, reducing the number of objects
- Action schema splitting - action schema optimization that breaks big action schemas down into smaller schemas with a smaller interpretation/grounding
- Domain and fact-space cleanup removes unused actions and facts prior to running “smarter” portions of the system.
- Caching of function interpretations skips some compilation and optimization steps
- Selecting only relevant records from the database
- Parallel symbolic execution
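The payoff of action schema splitting can be shown with simple arithmetic (the object counts and the schema names are made up for this sketch): a monolithic schema grounds over the product of its parameter domains, while two split schemas, linked by an intermediate predicate carrying the shared parameter, ground over the sum of much smaller products.

```python
# Illustrative only: object counts are invented. A monolithic schema
# drive_and_load(?item, ?truck, ?loc) grounds once per (item, truck, loc)
# triple; splitting it into load(?item, ?truck) and drive(?truck, ?loc),
# linked by an intermediate predicate on ?truck, grounds far fewer actions.
items, trucks, locs = 1000, 50, 200

monolithic = items * trucks * locs        # 10_000_000 ground actions
split = items * trucks + trucks * locs    # 60_000 ground actions
print(f"reduction: {monolithic // split}x")  # 166x
```

The product-to-sum change is the whole trick: the more parameters a schema has, the more a well-chosen split saves.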
Machine-learning and meta-level techniques include:
- Machine-learned partial grounding - reduces the search space by analyzing existing plans and inferring additional knowledge about the domain; shown to accelerate search by 1000x
- Genetic programming-based heuristic configuration optimization - Fast Downward, the underlying solver, comes with a powerful search configuration language that allows construction of novel domain-specific heuristics; GP has been shown to accelerate search by 1000x
- Schema splitting optimization - configuring the parameters of the splitter function and heuristics to yield an optimal grounding
- Reordering of preconditions (assigning order scores to preconditions) - optimizes the grounding path, splitter performance, and search direction by reordering the preconditions
- Schema modification using automatic proof: (a) interpreted without data (e.g., bound propagation); (b) interpreted with data (e.g., ALEPH); (c) [in between]
- Learning new proof methods / partial plans from solved plans
- Metaplanner - a recursive planner that defines and loads the planning problem as a planning problem; creates a universal embedding for machine learning; applicable to infinite recursion across many “meta-levels”
- Hinting - simplified machine learning embeddable directly in the PDDL domain definition: calculate the sets of objects that can be selected by a specific action and inject them as preconditions; requires stabilization of object hashes
- Grounding mining - some problems may never fit into computer memory or be optimally solved (for example, the task of task optimization). Grounding mining is a partial grounding approach that works like a combination of lifted planning and machine-learned grounding; it should be better than both lifted planning and blind search and should only be applied at the meta level (for tasks of optimizing the tasks and search).
- Deep meta applications of the planner: (a) searching for a heuristic for a better heuristic…; (b) optimizing the task optimizer for the task optimizer…
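To make the precondition-reordering idea from the list concrete, here is a minimal selectivity-scoring sketch (the fact counts and the scoring rule are invented for illustration, not HyperC's actual heuristic): checking the most restrictive precondition first prunes non-matching groundings before the expensive joins run.

```python
# Sketch of precondition reordering by selectivity: score each precondition
# by how many facts currently match it, and check the rarest ones first so
# mismatches are rejected cheaply. Fact counts are illustrative.
facts = {
    "at":      [("t1", "depot"), ("t2", "depot")],          # 2 matching facts
    "item_at": [(f"i{n}", "depot") for n in range(100)],    # 100 matching facts
    "free":    [("t1",)],                                   # 1 matching fact
}

def order_score(pred: str) -> int:
    # Fewer matching facts = more selective = checked earlier.
    return len(facts[pred])

preconds = ["item_at", "at", "free"]
print(sorted(preconds, key=order_score))  # ['free', 'at', 'item_at']
```

The same scores can also steer the splitter and the search direction, which is why the list above treats ordering as an optimization target in its own right.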
Remark on Plan Optimality
HyperC aims to outperform humans at planning problems, so we provide no guarantee of plan optimality, nor any guarantee of solution safety. However, some modes include double-evaluation using Python, so plan correctness can be asserted with a high level of confidence.
Thus HyperC is allowed to “cut corners” and “cheat” in order to find feasible solutions faster.