MACHINE LEARNING TECHNIQUES AS APPLIED TO DISCRETE AND COMBINATORIAL STRUCTURES is a well-researched Physical Sciences and Mathematics Thesis/Dissertation topic, it is to be used as a guide or framework for your Academic Research.
Machine learning techniques, while well established for many observation types, have only recently come onto the scene for graphs and other combinatorial objects. Further, the use and efficacy of machine learning techniques in predicting computationally difficult invariants on discrete and combinatorial objects is next to the unknown.
This Thesis outlines methodologies useful for articulating discrete structures in the paradigm of many machine learning algorithms. Moreover, we examine several NP-hard problems in different representations. We then report on the results of various techniques and methodologies in solving certain families of these problems.
Machine learning techniques are well established for many observation types and application domains, ranging from ecology to image recognition to language translation. Nevertheless,
machine-learning techniques have only recently come onto the scene for use in graphs and other combinatorial objects, such as matrices operating under algebraic operations outside the fields of R and C.
Further, the use and efficacy of machine learning techniques in predicting computationally difficult invariants on discrete combinatorial objects is next to the unknown.
This Thesis outlines methodologies useful for articulating discrete structures in the paradigm of many machine learning algorithms. Moreover, we examine several NP-hard problems in different articulations. We then report on the results of various techniques and methodologies in solving certain families of these problems.
1.1 Motivating Problems This thesis focuses on three exemplary decision problems:
• The Boolean Matrix Factorization Problem
• The Tournament Isomorphism Problem
• The Feedback Asset Number Problem
These problems are selected as motivating examples since the underlying decidability of these problems lies in the computational class NP. In particular, the Boolean Matrix Factorization problem is NP-Complete , as well as the Feedback Asset Number Problem .
While the Tournament Isomorphism problem resides in DLOGSPACE (a subset of P class problems), the best-known algorithm has a time complexity of O(n log2(n)) , making the problem intractable in practical settings.
Formally, a decision problem X asks the following question: “Given a set of n constraints, does there exist at least one solution? Yes or no? ” If X can be decided in polynomial time with respect to n, then X is said to reside in the computational complexity class P. As a side note, we are often interested in asking subsidiary questions:
Assuming the response to X is “Yes, there exists at least one solution”, we subsequently ask, “What is (are) the solution(s)? Further, if there are multiple solutions that satisfy the n constraints, which is the optimal solution given a particular optimality criterion? ”
Suppose X is articulated as an optimization problem and the ordering of potential
solutions can be done in polynomial time. Then, as the optimal ordering of solutions is predicated on our knowledge of the existence of solutions to order, X falls into the same complexity class as its underlying decision problem.
The class NP is defined as the set of decision problems that have polynomial-time verifiers. A verifier is an algorithm which, when given decision problem X and additional information c (perhaps given from the oracle), can determine whether there exists at least one solution to X. In practice, this additional information c is often a solution to X itself.
As an illustrative example, consider the following articulation of the Traveling Salesman Problem: “Given a set of cities S and routes between cities R, does there exist a sequence of routes (r1, r2,⋯, rn) with ri ∈ R such that each city s ∈ S is visited exactly once, save for the starting city, which is returned to at the very end of the journey? ”
We recognize this problem to be in the computational complexity class NP since if we had a proposed solution (r1, r2,⋯, rn) given by the oracle which matched the requirements of visiting each city s exactly once (save for the first city, which we return to at the conclusion of rn), we could verify that this is, in fact, a solution in polynomial time.
Other variants of the Traveling Salesman Problem, such as those which apply weights to the routes, often induce a kind of optimization ordering among possible solutions. The underlying decision problem, “Does there exist at least one solution, regardless of route weighting?” remains the same.
There are other computational complexity classes, such as NP-Hard and NP-Complete, which give finer granularity to the classification of a decision problem being within NP. We refer the curious reader to  for a full exposition of these classes.
Our immediate motivation behind examining of each of the three problems of Boolean Matrix Factorization, Tournament Isomorphism, and the Feedback Asset Number, is due to all being decision problems (or optimization corollaries) which are computationally expensive to compute.