Basic data analysis guide for your thesis
Data analysis is the chapter that intimidates students the most. It's where your research goes from theory to evidence. But before you panic, you need to understand what type of analysis fits your research, which tools to use, and how to present your findings in a way that convinces your committee.
This guide walks you through the entire process, from choosing the right approach to avoiding the mistakes that sink theses at universities like UNAH, UTH, UNITEC, CEUTEC, and UPN in Honduras.
Quantitative vs Qualitative: which do you need
The first thing to understand is that the type of analysis depends on your research question, not your personal preference.
Quantitative: You work with numbers, such as Likert-scale surveys, measurements, and statistical data. You need descriptive statistics (mean, standard deviation, frequencies) and possibly inferential statistics (chi-square, Student's t-test, correlations, regressions).
Qualitative: You work with words, gathered through in-depth interviews, field observations, and focus groups. You typically need transcription, thematic coding, and triangulation, and in some designs discourse analysis.
Mixed methods: You combine both. For example, you administer a survey to 200 UNAH students and then conduct 10 in-depth interviews to understand the "why" behind the numbers. This approach is increasingly common in graduate theses.
How to decide which one to use
| Your research question... | Recommended approach |
|---|---|
| Seeks to measure, quantify, or compare | Quantitative |
| Seeks to understand experiences or meanings | Qualitative |
| Seeks to measure AND understand in depth | Mixed methods |
| Has a hypothesis with measurable variables | Quantitative |
| Explores a poorly studied phenomenon | Qualitative |
The four types of quantitative analysis
Not all quantitative analysis is the same. Depending on your objectives, you'll use one or more of these types:
1. Descriptive analysis
Describes what's in your data. This is the mandatory starting point.
- Frequencies and percentages (how many said yes, how many said no)
- Measures of central tendency (mean, median, mode)
- Measures of dispersion (standard deviation, range, variance)
- Frequency tables and bar charts
Real example: In a Business Administration thesis at UTH about job satisfaction, descriptive analysis tells you that 65% of employees rate their satisfaction as "high" and the Likert scale average is 3.8 out of 5.
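As a minimal sketch of the descriptive measures listed above, the following uses only Python's standard library. The `responses` list is hypothetical Likert data invented for illustration; it is not the UTH data from the example.

```python
from collections import Counter
from statistics import mean, median, mode, stdev, variance

# Hypothetical 5-point Likert responses (1 = very low, 5 = very high)
responses = [4, 5, 3, 4, 4, 2, 5, 4, 3, 4]

# Frequencies and percentages per response value
frequencies = Counter(responses)
percentages = {value: 100 * count / len(responses)
               for value, count in sorted(frequencies.items())}

# Measures of central tendency
avg = mean(responses)
mid = median(responses)
most_common = mode(responses)

# Measures of dispersion (sample formulas, n - 1 in the denominator)
sd = stdev(responses)
var = variance(responses)
value_range = max(responses) - min(responses)

print(percentages)
print(f"mean={avg:.2f} median={mid} mode={most_common}")
print(f"sd={sd:.2f} variance={var:.2f} range={value_range}")
```

SPSS, Jamovi, or Excel report the same figures; the point of the sketch is only to show what each number is computed from.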
2. Inferential analysis
Goes beyond describing — it seeks to generalize your sample results to the entire population.
- Student's t-test: compare means between two groups
- ANOVA: compare means among three or more groups
- Chi-square: association between categorical variables
- Normality tests (Shapiro-Wilk, Kolmogorov-Smirnov): check whether parametric tests such as the t-test and ANOVA are appropriate
Real example: In a Psychology thesis at UNAH, you use the t-test to determine if there's a significant difference in anxiety levels between students who work and those who don't.
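To make the t-test less of a black box, here is a rough sketch of how the statistic is computed for two independent groups. It assumes the pooled-variance (Student) version with roughly equal variances, and the anxiety scores are invented for illustration. The function name `student_t` is hypothetical, and the p-value is left to your statistics package.

```python
from math import sqrt
from statistics import mean, variance

def student_t(group_a, group_b):
    """Return (t, df) for an independent-samples Student's t-test
    with pooled variance (assumes roughly equal variances)."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    # Weighted average of the two sample variances
    pooled_var = ((n_a - 1) * variance(group_a)
                  + (n_b - 1) * variance(group_b)) / df
    std_error = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    t = (mean(group_a) - mean(group_b)) / std_error
    return t, df

# Hypothetical anxiety scores, invented for illustration
working = [22, 25, 27, 24, 26, 26]
not_working = [20, 21, 19, 23, 22, 21]

t_stat, df = student_t(working, not_working)
print(f"t({df}) = {t_stat:.2f}")  # prints "t(10) = 4.30"
```

With 10 degrees of freedom, standard t tables give a two-tailed critical value of about 2.23 at the 0.05 level, so a statistic of this size would count as a significant difference in this made-up data.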
3. Correlational analysis
Measures the relationship between two or more variables: whether one tends to rise or fall as the other rises.
- Pearson correlation: for continuous, normally distributed variables with a linear relationship
- Spearman correlation: for ordinal variables or non-normal distributions
- Coefficient of determination (R-squared): the proportion of variance in one variable explained by the other
Real example: In a Marketing thesis at UNITEC, you measure the correlation between social media investment and monthly sales of SMEs in Tegucigalpa. An r = 0.72 indicates a strong positive correlation.
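The three measures above can be sketched in a few lines of plain Python. The helper names (`pearson`, `ranks`, `spearman`) and the investment and sales figures are assumptions made up for this illustration. Spearman is computed here as Pearson's r applied to the ranks, with tied values sharing their average rank.

```python
from math import sqrt

def pearson(x, y):
    """Pearson's r computed from deviations around the means."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def ranks(values):
    """Rank values from 1 to n, giving tied values their average rank."""
    ordered = sorted(values)
    return [ordered.index(v) + (ordered.count(v) + 1) / 2 for v in values]

def spearman(x, y):
    """Spearman's rho: Pearson's r applied to the ranks."""
    return pearson(ranks(x), ranks(y))

# Hypothetical data: monthly social media investment and monthly sales
investment = [1, 2, 3, 4, 5]
sales = [2, 4, 5, 4, 5]

r = pearson(investment, sales)
rho = spearman(investment, sales)
print(f"Pearson r = {r:.2f}, R-squared = {r ** 2:.2f}, Spearman rho = {rho:.2f}")
```

The two coefficients differ slightly because Spearman only looks at the order of the values, not at the distances between them.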
4. Regression analysis
Predicts the value of one variable based on others. This is usually the most advanced analysis found in undergraduate theses.
- Simple linear regression: one independent variable
- Multiple linear regression: several independent variables
- Logistic regression: when the dependent variable is categorical (yes/no)
Real example: In an Industrial Engineering thesis at UPN, you predict production time based on the number of operators, shift temperature, and training hours.
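For simple linear regression, the fitted line can be sketched with ordinary least squares as below. The function name `fit_line` and the training and production figures are hypothetical. Multiple and logistic regression need a statistics package and are not covered by this sketch.

```python
def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    slope = (sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
             / sum((a - mean_x) ** 2 for a in x))
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Hypothetical data: training hours vs. production time in minutes
training_hours = [1, 2, 3, 4, 5]
production_minutes = [52, 49, 47, 44, 43]

intercept, slope = fit_line(training_hours, production_minutes)
predicted = intercept + slope * 6  # prediction for 6 training hours
print(f"time = {intercept:.1f} + ({slope:.1f}) * hours; at 6 hours: {predicted:.1f}")
```

In this invented data each extra training hour is associated with about 2.3 fewer minutes of production time, which is how you would read the slope in your results chapter.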
Step-by-step data analysis process
Follow these steps in order. Skipping a step is one of the most common reasons the results chapter gets rejected.
Step 1: Organize your database
Before running any test, you need a clean database.
- Each row is a case (one respondent, one observation)
- Each column is a variable (age, gender, response to question 1, etc.)
- Code the responses: "Strongly agree" = 5, "Agree" = 4, etc.
- Check for missing data, decide how to handle it (delete the case, replace with the mean, etc.), and note the rule you applied
- Save the file in .xlsx or .csv format
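The coding and missing-data steps above might look like this in practice. This is a sketch under assumptions: the scale labels in `LIKERT_CODES` and the raw answers are made up, and an empty string stands in for a missing answer.

```python
from statistics import mean

# Numeric codes for a 5-point agreement scale (labels are an assumption)
LIKERT_CODES = {
    "Strongly agree": 5,
    "Agree": 4,
    "Neutral": 3,
    "Disagree": 2,
    "Strongly disagree": 1,
}

# Hypothetical raw answers to one item; "" marks a missing answer
raw_answers = ["Agree", "Strongly agree", "", "Neutral", "Agree", "Disagree"]

# Code the answers; anything not in the dictionary becomes None
coded = [LIKERT_CODES.get(answer) for answer in raw_answers]
missing_count = coded.count(None)

# Option 1: delete the incomplete cases
complete_cases = [value for value in coded if value is not None]

# Option 2: replace missing values with the mean of the observed ones
item_mean = mean(complete_cases)
imputed = [item_mean if value is None else value for value in coded]

print(f"missing: {missing_count}, mean of observed: {item_mean}")
print(imputed)
```

Note that a misspelled label would also be coded as `None` here, so a high missing count can be a sign of typos in the raw data rather than real non-response.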
Step 2: Run descriptive analysis
Always start here. Calculate frequencies, percentages, means, and standard deviations for each variable.
Step 3: Check statistical assumptions
Before applying inferential tests, verify:
- Normality: Shapiro-Wilk (small samples, commonly fewer than 50 cases) or Kolmogorov-Smirnov (larger samples)
- Homoscedasticity: Levene's test
- Independence of observations
If your data are not normally distributed, use the non-parametric equivalent (Mann-Whitney U instead of the t-test, Kruskal-Wallis instead of ANOVA).
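To show why Mann-Whitney does not depend on normality, here is a small sketch of its U statistic, which only compares which value in each pair is larger. The function name `mann_whitney_u` and the scores are hypothetical, and the p-value is left to your statistics package.

```python
def mann_whitney_u(group_a, group_b):
    """Return the U statistic (the smaller of U_a and U_b)."""
    # U_a counts the pairs where a value from A beats a value from B;
    # ties count as half a win
    u_a = sum(1.0 if a > b else 0.5 if a == b else 0.0
              for a in group_a for b in group_b)
    u_b = len(group_a) * len(group_b) - u_a
    return min(u_a, u_b)

# Hypothetical scores for two independent groups
group_1 = [22, 25, 27, 24, 26, 26]
group_2 = [20, 21, 19, 23, 22, 21]

u = mann_whitney_u(group_1, group_2)
print(f"U = {u}")  # prints "U = 1.5"
```

A U close to zero means the two groups barely overlap. Because only the ordering of values matters, a skewed distribution or an extreme score does not distort the result the way it can with a t-test.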
Step 4: Apply the statistical tests
Select the test based on your objective and the type of variables:
| Objective | Variables | Test |
|---|---|---|
| Compare 2 groups | 1 categorical + 1 continuous | t-test / Mann-Whitney |
| Compare 3+ groups | 1 categorical + 1 continuous | ANOVA / Kruskal-Wallis |
| Association between categorical | 2 categorical | Chi-square |
| Relationship between continuous | 2 continuous | Pearson / Spearman |
| Prediction | 1 dependent + 1 or more independent | Regression |
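The table above can also be read as a simple lookup. The sketch below encodes it as a dictionary; the key names and the `choose_test` helper are hypothetical, and the `normal` flag stands for whatever your assumption checks in Step 3 concluded.

```python
# (parametric option, non-parametric option) for each objective in the table
TESTS = {
    "compare_2_groups": ("t-test", "Mann-Whitney"),
    "compare_3plus_groups": ("ANOVA", "Kruskal-Wallis"),
    "categorical_association": ("Chi-square", "Chi-square"),
    "continuous_relationship": ("Pearson", "Spearman"),
    "prediction": ("Regression", "Regression"),
}

def choose_test(objective, normal=True):
    """Return the test the table suggests for this objective.

    normal: True if the normality and related assumptions were met.
    """
    parametric, non_parametric = TESTS[objective]
    return parametric if normal else non_parametric

print(choose_test("compare_2_groups", normal=False))       # Mann-Whitney
print(choose_test("continuous_relationship", normal=True)) # Pearson
```

Writing the decision down this explicitly, even on paper, makes it easier to justify your choice of test to the committee.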
Step 5: Interpret the results
It's not enough to report the p-value. You need to:
- Report the test statistic (t, F, chi-square, r)
- Report the p-value and compare it with your significance level (usually 0.05)
- Interpret in plain language what it means for your research
- Connect it back to your objectives and hypotheses
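As one possible way to keep your reporting consistent, the helper below builds a results sentence from the statistic, its degrees of freedom, and the p-value. The `report` function is hypothetical and the values passed to it are invented. It follows the APA convention of dropping the leading zero in p-values and reporting very small ones as p < .001.

```python
def report(statistic_name, value, p_value, alpha=0.05, df=None):
    """Build a one-line summary of a test result."""
    df_part = f"({df})" if df is not None else ""
    # APA style: no leading zero, and very small p-values shown as < .001
    if p_value < 0.001:
        p_part = "p < .001"
    else:
        p_part = "p = " + f"{p_value:.3f}".lstrip("0")
    verdict = "significant" if p_value < alpha else "not significant"
    return (f"{statistic_name}{df_part} = {value:.2f}, {p_part} "
            f"({verdict} at alpha = {alpha})")

# Invented values, for illustration only
print(report("t", 2.16, 0.032, df=278))
# t(278) = 2.16, p = .032 (significant at alpha = 0.05)
```

The sentence this produces covers only the first two bullets. The plain-language interpretation and the link back to your hypotheses still have to be written by you.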
Step 6: Present with tables and charts
Every result needs a formal table or chart, complete with title, numbering, and source.
Tools for data analysis
For quantitative analysis
| Tool | Best for | Learning curve | Cost |
|---|---|---|---|
| Microsoft Excel | Basic descriptives, simple charts | Low | Included with Office |
| SPSS | Inferential tests, the most widely used in Honduras | Medium | University license |
| R / RStudio | Advanced analysis, professional graphics | High | Free |
| Jamovi | Free alternative to SPSS, user-friendly interface | Low-Medium | Free |
| Google Sheets | Basic descriptives, collaboration | Low | Free |
Recommendation for students in Honduras: If your university has an SPSS license (UNAH and UNITEC usually do), use it. If not, Jamovi is the best free alternative because it has a similar interface to SPSS and generates APA-formatted tables.
For qualitative analysis
| Tool | Best for | Cost |
|---|---|---|
| ATLAS.ti | Professional thematic coding | License (student discount available) |
| MAXQDA | Mixed methods analysis | License |
| NVivo | Large volumes of qualitative data | License |
| Google Docs + color coding | Basic manual coding | Free |
How to present data correctly
Data presentation is where many theses lose points unnecessarily. Follow these rules:
Tables:
- Title above the table, numbered (Table 1, Table 2...)
- No vertical lines (APA standard)
- Source below the table
- Only the necessary horizontal lines
Charts:
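- Title above the chart, numbered (Figure 1, Figure 2...), if your university follows APA 7th edition; APA 6th edition placed it below, so check which edition your faculty requires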
- Bars for comparing categories
- Lines for trends over time
- Pie charts only when you have few categories (maximum 5-6)
- Always include data labels
Golden rule: Every table or chart should be understandable without reading the text. If someone only looks at the table, they should understand what you're showing.
The most common mistakes (and how to avoid them)
Mistake 1: Presenting data without interpreting it
"70% answered yes" says nothing on its own. Explain what it means in the context of your research, for example: "70% of surveyed professors at UNAH consider the virtual platform useful, which supports hypothesis H1 that the perception of usefulness is predominantly positive."
Mistake 2: Using the wrong statistical test
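Typical cases are applying Pearson when your data aren't normally distributed (Spearman is the usual alternative) or using chi-square when expected frequencies fall below 5 (Fisher's exact test is a common alternative for 2x2 tables). Always verify the assumptions before choosing a test.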
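The expected-frequency check for chi-square can be sketched as follows for a 2x2 table. The function name `chi_square_2x2` and the observed counts are hypothetical. No continuity correction is applied, and the closed-form p-value used here is valid only for 1 degree of freedom.

```python
from math import erfc, sqrt

def chi_square_2x2(table):
    """Return (chi2, expected, p_value) for a 2x2 table of observed counts.

    No continuity correction is applied.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    # Expected count = row total * column total / grand total
    expected = [[r * c / total for c in col_totals] for r in row_totals]
    chi2 = sum((obs - exp) ** 2 / exp
               for obs_row, exp_row in zip(table, expected)
               for obs, exp in zip(obs_row, exp_row))
    # With 1 degree of freedom the p-value has a closed form
    p_value = erfc(sqrt(chi2 / 2))
    return chi2, expected, p_value

# Hypothetical counts: rows = two groups, columns = answered yes / no
observed = [[30, 20], [15, 35]]
chi2, expected, p_value = chi_square_2x2(observed)

# Cells that break the "expected frequency of at least 5" rule
low_cells = [e for row in expected for e in row if e < 5]
print(f"chi2 = {chi2:.2f}, p = {p_value:.4f}, cells below 5: {len(low_cells)}")
```

If `low_cells` is not empty, the chi-square result should not be reported as is. That is the situation this mistake describes.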
Mistake 3: Ignoring outliers
A single outlier can completely distort your results. Identify outliers with box plots and decide whether to remove them or report them separately.
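Box plots usually flag outliers with the 1.5 x IQR rule, which can be sketched like this. The function name `iqr_outliers` and the data are made up, and quartile values can differ slightly between programs because they interpolate in different ways.

```python
from statistics import mean, quantiles

def iqr_outliers(values):
    """Return the values outside the 1.5 * IQR fences used by box plots."""
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [v for v in values if v < lower or v > upper]

# Hypothetical measurements with one extreme value
data = [12, 14, 15, 15, 16, 17, 18, 19, 20, 45]

outliers = iqr_outliers(data)
cleaned = [v for v in data if v not in outliers]
print(f"outliers: {outliers}")
print(f"mean with outliers: {mean(data):.2f}, without: {mean(cleaned):.2f}")
```

In this invented data a single value shifts the mean by almost three points, which illustrates why the decision to keep or remove it has to be reported.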
Mistake 4: Confusing correlation with causation
Two variables being correlated doesn't mean one causes the other. "Students who sleep more have better grades" doesn't mean that sleep causes good grades, because a third variable may be influencing both.
Mistake 5: Sample size too small
You need a formal sample size calculation BEFORE collecting data. A sample of 15 people is rarely sufficient for inferential tests.
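One widely taught way to do that calculation is the finite population formula for estimating a proportion, n = N·Z²·p·q / (e²·(N − 1) + Z²·p·q). The sketch below assumes that formula, a 95% confidence level (Z = 1.96), maximum variability (p = 0.5), and a 5% margin of error. Your methodology advisor may require different values, and the population of 1,000 is hypothetical.

```python
from math import ceil

def sample_size(population, z=1.96, p=0.5, margin=0.05):
    """Sample size for a proportion in a finite population.

    z: z-score for the confidence level (1.96 for 95%)
    p: expected proportion (0.5 gives the largest sample)
    margin: acceptable margin of error
    """
    q = 1 - p
    numerator = population * z ** 2 * p * q
    denominator = margin ** 2 * (population - 1) + z ** 2 * p * q
    # Round up: you cannot survey a fraction of a person
    return ceil(numerator / denominator)

print(sample_size(1000))  # prints 278
```

Tightening the margin of error or raising the confidence level increases the required sample, so fix those values before you start collecting data.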
Mistake 6: Not reporting instrument reliability
If you use a questionnaire, you need to report Cronbach's Alpha. A value below 0.70 is conventionally read as insufficient internal consistency, so revise or justify the instrument before relying on it.
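Cronbach's Alpha compares the variance of each item with the variance of the total score. Here is a minimal sketch of the standard formula, alpha = k/(k − 1) · (1 − sum of item variances / variance of totals). The function name `cronbach_alpha` and the five respondents with three items are invented for illustration.

```python
from statistics import variance

def cronbach_alpha(rows):
    """rows: one list of item scores per respondent."""
    k = len(rows[0])                      # number of items
    items = list(zip(*rows))              # one tuple of scores per item
    item_variance_sum = sum(variance(item) for item in items)
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - item_variance_sum / total_variance)

# Hypothetical answers: 5 respondents x 3 Likert items
answers = [
    [4, 5, 4],
    [3, 3, 2],
    [5, 5, 4],
    [2, 3, 2],
    [4, 4, 3],
]

alpha = cronbach_alpha(answers)
print(f"Cronbach's alpha = {alpha:.2f}")
```

The value is high in this toy data because the three items rise and fall together. Real instruments with more items and respondents usually give less extreme values.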
Complete example: thesis on student satisfaction
Let's imagine an Educational Administration thesis at UNAH: "Student satisfaction with the virtual modality in the Faculty of Social Sciences."
- Approach: Quantitative, descriptive-correlational scope
- Instrument: 25-item questionnaire with 5-point Likert scale
- Sample: 280 students (calculated using the finite population formula)
- Descriptive analysis: Satisfaction mean = 3.2/5, standard deviation = 0.89
- Reliability: Cronbach's Alpha = 0.87 (reliable)
- Inferential analysis: t-test to compare satisfaction between men and women (p = 0.032, significant difference found)
- Correlational analysis: Spearman correlation between satisfaction and academic performance (rho = 0.45, moderate correlation)
- Interpretation: Students show moderate satisfaction. Women report higher satisfaction than men. A moderate positive relationship exists between satisfaction and performance. This is consistent with the idea that improving the virtual experience could benefit grades, but a correlation alone does not establish that effect.
Qualitative analysis step by step
If your research is qualitative, the process is different but equally rigorous:
- Transcription: Transcribe all interviews or focus groups verbatim
- Initial reading: Read all the material without coding — familiarize yourself with the data
- Open coding: Assign codes to meaningful fragments of text
- Axial coding: Group codes into categories and subcategories
- Triangulation: Compare findings across different sources (interviews, documents, observations)
- Theoretical saturation: Determine when no new categories emerge
- Writing findings: Present each category with direct quotes from participants
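If you code manually rather than in ATLAS.ti or NVivo, the open and axial coding steps above can be tracked with a very small data structure. Everything in this sketch (interview names, codes, and categories) is hypothetical.

```python
from collections import Counter

# Open coding: (source, code) pairs assigned to meaningful fragments
coded_fragments = [
    ("interview_01", "slow internet"),
    ("interview_01", "flexible schedule"),
    ("interview_02", "slow internet"),
    ("interview_02", "no feedback"),
    ("interview_03", "flexible schedule"),
    ("interview_03", "platform errors"),
]

# Axial coding: group the open codes into broader categories
CATEGORIES = {
    "slow internet": "Technical barriers",
    "platform errors": "Technical barriers",
    "flexible schedule": "Perceived benefits",
    "no feedback": "Teaching support",
}

# How many fragments support each category
category_counts = Counter(CATEGORIES[code] for _, code in coded_fragments)

# Which interviews mention each category (useful when judging saturation)
sources = {}
for interview, code in coded_fragments:
    sources.setdefault(CATEGORIES[code], set()).add(interview)

for category, count in category_counts.most_common():
    print(f"{category}: {count} fragments, {len(sources[category])} interviews")
```

A category that appears in only one interview is a candidate for further sampling, while one that keeps recurring without new codes points toward saturation.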
Important tip: In qualitative theses in Honduras, committees typically require a minimum of 8-10 interviews to consider saturation adequate. At UNAH, some faculties require at least 12.
A solid data analysis can make the difference between an approved thesis and a rejected one. Don't leave this part to chance — it's the backbone of your entire research.