How to Craft a Strong AI/ML Thesis Statement
Author(s): Ayo Akinkugbe

Originally published on Towards AI.

Defining Scope, Hypotheses, and Contribution Boundaries for Clarity, Testability, and Impact in AI & ML Research

Photo by Omar:. Lopez-Rincon on Unsplash

Snapshot

A thesis statement is the central claim of your dissertation or research. It articulates the core idea your entire AI/ML project is designed to test and validate. It is your intellectual anchor, guiding every methodological decision and framing the contribution you aim to make to the field.

A well-crafted thesis statement does more than summarize your topic. It establishes the logic of your research. It clarifies what you believe is true, what you intend to test, and why your work matters. Many doctoral candidates struggle with this step because they write statements that are either too broad (for example, “This research explores interpretability in machine learning”) or too vague (“This project looks at improving model performance”). The diagram below delineates the characteristics of a weak thesis compared to a strong one.

A strong thesis statement should be specific, testable, and firmly tied to a research gap in the literature. An example: “This research investigates whether L1-regularized sparse autoencoders improve reconstruction performance on tabular clinical datasets compared to standard autoencoders.”

Before defining scope or contribution, a thesis must articulate testable claims, which is why hypothesis formulation is the logical first step. Your claim must also be falsifiable; if it is not, it is not a thesis.

To make your thesis actionable and defensible, it helps to understand how it interacts with other key research elements:

Hypotheses: These are testable statements derived from your thesis. The alternative hypothesis (H1) expresses the expected effect, while the null hypothesis (H0) asserts no effect. Hypotheses define the experiments or analyses that can support or refute your claim.
Scope: Scope defines where and how your thesis applies: the models, datasets, domains, and metrics included. It ensures your research remains feasible, rigorous, and focused.

Delimitations: These are intentional exclusions, aspects you decide not to study due to practical or methodological constraints. Delimitations help reviewers understand the boundaries of your research and prevent misinterpretation.

Contribution Boundaries: Your thesis implicitly or explicitly defines what your work adds to the field. Contributions can be algorithmic, empirical, theoretical, methodological, or applied. Clear boundaries prevent overclaiming and clarify the intellectual value of your work.

In this article, we explore how to craft hypotheses, define scope and delimitations, and clarify contribution boundaries, showing how these elements integrate to form a credible thesis statement. Understanding how your thesis interacts with these research components allows you to design research that is meaningful and manageable, giving your work clarity and impact from the start.

Workflow: From Thesis Statement to Evaluation

- The thesis statement is the claim being made.
- Hypotheses are how you test the claim.
- Scope is where or how the claim is valid.
- Your contribution is why the claim matters to the field.

Hypothesis Formulation: Building H1 and H0

A strong thesis statement in AI/ML almost always includes a hypothesis: a provisional claim you intend to support or reject with evidence. The hypothesis is the backbone of your methodology because it defines the relationship you aim to test. Most research uses two complementary components: the alternative hypothesis (H1) and the null hypothesis (H0). Depending on scope, your research may involve several such pairs (H1, H2, H3, ...).

The alternative hypothesis (H1) is the statement you believe will be supported.
It represents the expected effect or relationship in your study. For example, in an AI-generated text detection research project, H1 might be: “A curvature-based detection method (inspired by DetectGPT) will outperform traditional perplexity-based detectors under paraphrasing attacks by . . .” This assumes an expected direction of improvement.

The null hypothesis (H0) is the opposite: it states there is no effect, no improvement, or no detectable relationship. A well-formulated null hypothesis would be: “A curvature-based detection method performs no better than existing perplexity-based detectors under paraphrasing transformations.” The key is that H0 must be testable. You must be able to reject or fail to reject it based on experimental evidence.

In AI/ML, hypotheses typically fall into three categories:

Performance hypotheses: claims of improvement in accuracy, robustness, or generalization.

Behavioral hypotheses: assertions of differences in how models act under stressors, perturbations, or distribution shifts.

Structural hypotheses: claims that a new architecture, representation, or training strategy provides qualitative benefits (e.g., interpretability or efficiency).

Clear hypotheses push you toward concrete evaluation metrics, well-chosen baselines, and reproducible experiments. A thesis without hypotheses often becomes descriptive rather than analytical.

Defining Scope: What Your Research Will Cover

Scope refers to the boundaries of your research in terms of concepts, models, datasets, and methods. Defining scope is important because it prevents your thesis from becoming unmanageably broad, a common risk in fast-moving fields like AI/ML. Scope is not about limiting ambition; it is about designing a project that is achievable, rigorous, and defensible.

For example, suppose you are studying interpretability methods for large language models.
Your scope might specify that you will focus only on transformer-based architectures between 7B and 13B parameters, using English-language corpora and analyzing token-level attribution methods. This scope is clear and reasonable. It signals to the reader what is included in your inquiry and prepares them for the methodological choices you will make later.

Strong scope statements often include:

- The specific models or techniques being studied (e.g., sparse autoencoders, RNNs, diffusion models)
- The domain or application area (e.g., medical NLP, autonomous driving, cybersecurity)
- The datasets or types of data (e.g., tabular clinical data, synthetic benchmarks, conversational text)
- The metrics or evaluation frameworks (e.g., MAUVE, F1, calibration error, computational […]