Principal component analysis for social sustainability in India’s automobile sector
Sustainability has become an important cornerstone for evaluating organizational practices. With growing recognition of sustainability across businesses due to resource depletion, changing customer demands, and regulatory pressure, the automotive industry is increasingly focused on using sustainable practices (Helman et al., 2023). This article focuses on social sustainability, a critical component for fostering societal well-being across generations.
Leading automobile companies are adopting strategies that meet the UN Sustainable Development Goals, contributing not only to global sustainability but also to the brand value of the companies (Lukin et al., 2022). Previous literature has shown that the sustainability performance of a company can broadly be judged by two major categories of indicators:
- Environmental Indicators: These measure the environmental impact of a company’s operations, including emissions, energy consumption, water usage, CO2 abatement, and waste management.
- Social Indicators: These reflect aspects of employee well-being, including workplace safety, diversity, community engagement, and overall workforce satisfaction.
Indicators of Social Sustainability
Social indicators are an important component of sustainable development because they capture the societal well-being of present and future generations (Husgafvel, 2021). Social sustainability spans multiple dimensions, namely equity, safety, quality of life, social inclusion, and adaptability, and implementing it is important for reshaping social well-being (Ly & Cope, 2023). A considerable amount of research has addressed the broader dimensions of social sustainability. In a previous milestone, several past studies were reviewed and 42 indicators of social sustainability were identified, which are also represented by 8 of the UN Sustainable Development Goals (SDGs). Of these, some indicators were prominent in the automobile sector. Researchers such as Rajesh (2020) and Vijayakumar et al. (2022) used variables like gender diversity and wages to measure the social sustainability level of an industry. Building on the previous milestone, this article uses Principal Component Analysis (PCA) to identify and assess key social sustainability indicators in the automotive sector.
The aim of this article is to identify which of the many social sustainability indicators are relevant to the Indian automobile industry. The purpose is to help the automobile sector improve its social sustainability practices by focusing specifically on the relevant indicators. This will help companies not only achieve their sustainability goals in a simpler and more efficient way, but also contribute to the social performance of the country.
Data collection procedure for social sustainability indicators
The dataset utilized in this study contains comprehensive information on the social sustainability indicators reported by automotive companies. Seventeen automobile companies listed on the Bombay Stock Exchange (BSE) and National Stock Exchange (NSE) in 2024 were included in the dataset. Although the study aimed to consider all listed automobile companies, some, such as Sundaram Fasteners and Cummins India, did not provide any information about their sustainability practices and outcomes in their annual reports. Therefore, the final dataset consisted of the following 17 companies:
- Ashok Leyland
- Apollo Tyres
- Bajaj Auto
- Balakrishna Industries
- Bosch
- Eicher Motors
- Escorts Kubota
- Hero Motors
- Mahindra & Mahindra
- SML Isuzu
- Tata Motors
- Tube Investments
- TVS Motors
- Force Motors
- Maruti Suzuki
- Motherson Samvardhana
- MRF Tyres
In the previous milestone, over 15 social indicators relevant to the automobile and automotive sector were identified. However, a careful examination of the above companies’ BRSR (Business Responsibility and Sustainability Report) and annual reports showed that data for many of these indicators was not reported.
None of the companies reported their R&D investments in sustainable technology development, work culture, or investment in education. Therefore, only those social sustainability indicators that were available in the BRSR and annual reports of most companies were selected.
| Social sustainability indicators |
|---|
| Hired |
| Gender Diversity |
| LTIFR |
| Retention rate |
| Board diversity |
| Total trained employees/ workers |
| Independent Directors in Board |
| Total wages (in million) |
The purpose of dimension reduction of social sustainability indicators
While these indicators provide a comprehensive framework for sustainability reporting, it is unclear whether they are all equally effective in measuring their respective dimensions of sustainability. Some indicators may contribute significantly to defining environmental or social sustainability, while others might add little value or introduce redundancy. For instance, Lisowski et al. (2020) initially identified 247 indicators of environmental sustainability in the automobile sector, of which only 31 were found to be relevant. Similarly, Li et al. (2012) found that only 13 of 32 sustainability indicators were relevant after an empirical examination of the sustainability reports of electronics companies. Dimensionality reduction not only supports a comprehensive examination of companies’ sustainability performance but also identifies areas of improvement. It is important because working with a large number of metrics is practically difficult, and the goal is to reduce them to a more manageable set of core indicators addressing many different areas (Munier, 2011). Moreover, sustainability is a complex phenomenon, and to capture its complexity, composite indices must be created by integrating multiple indicators (Gan et al., 2017). Nathan and Reddy (2011) proposed the following criteria, which a dimensionality reduction technique for sustainability indicators must meet:
- The number of indicators must be neither too large nor too small, but an exact representation of the given situation.
- The chosen indicators must be relevant to the objective, simple to understand, analytically sound, and ensure policy responsiveness.
Therefore, Principal Component Analysis (PCA) was used in this study to reduce the eight social sustainability indicators to the three or four that most accurately represent the companies’ social sustainability performance.
Principal Component Analysis (PCA) to extract relevant variables
PCA is a technique that helps in dimension reduction and better interpretation of large datasets (Jolliffe & Cadima, 2016). It helps uncover patterns and relationships in the dataset, measure the contribution of each element to a dimension, and reduce dimensionality with minimal loss of information. The method is therefore suitable here for identifying the relevant measures of social sustainability. Its use will provide:
- A deeper understanding of the interrelationships between sustainability indicators, and
- Insights into the most critical indicators that drive environmental and social sustainability in companies.
Several past studies have advocated the use of PCA for aggregating sustainability indicators. For instance, Zhao (2015) applied PCA to extract principal components representing nine indices that cover four aspects of a country’s sustainability: economy, environment, resources, and society.
PCA reduces the dimensionality of the data without significant information loss using a linear transformation technique (Gan et al., 2017).
Wang and Chen (2022) aimed to study countries’ social sustainability performance beyond their gross domestic product (GDP) rankings and identified 13 potential indicators. They applied PCA to reduce them to a more manageable 2 principal components, which accurately captured social progress and public values. Li et al. (2012) studied the social sustainability performance of 11 electronics companies, identified potential indicators, and used PCA to reduce them to 4 principal components. Martins et al. (2021) performed a neighbourhood-level assessment of urban sustainability for different regions in Brazil and identified 14 social sustainability indicators, which were reduced to 5 principal components. Asbahi et al. (2019) favoured PCA because, unlike conventional methods, it does not allocate ad hoc and subjective weights to different indicators. Finally, Cao et al. (2023) used PCA to reduce 13 indicators of higher education quality and sustainability of educational institutions to four principal components. Therefore, this study too aims to enhance the precision and practicality of sustainability evaluations by using PCA to reduce the 8 social sustainability indicators to fewer principal components. This will support automobile companies in achieving more transparent and impactful sustainability practices.
Data Preparation
Raw data collected from secondary reports are not always appropriate for making deductions (Amiram et al., 2015; Crucean, 2024; Sabauri and Kvatashidze, 2024; Turkmen, 2016). Therefore, there is a need to process it to have a better understanding of the data. Herein, the process adopted for making data ready for PCA analysis is as follows.
Step 1. Data Loading
Our initial dataset was a cohesive one comprising environmental and social sustainability indicators. This dataset was imported from an Excel file. The data was read into a pandas DataFrame by connecting the Google Colab notebook with Google Drive.
Step 2. Data Modification
Before proceeding with PCA, it is necessary to ensure that all the data is numerical. This can be checked by identifying the data types of the columns. The .dtypes attribute returns a Series with the data type of each column.
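The data type check can be sketched as below. This is a minimal illustration, not the study's notebook: the small DataFrame, its values, and the `Company` column name are placeholders standing in for the imported Excel sheet, and the coercion step with `pd.to_numeric` is an assumed way of handling numbers that were read in as text.

```python
import pandas as pd

# Hypothetical stand-in for the imported sheet; values are placeholders, not study data.
df = pd.DataFrame({
    "Company": ["A", "B", "C"],
    "Gender Diversity": [12.5, 8.0, 15.2],
    "LTIFR": ["0.21", "0.35", "NA"],  # numbers stored as text, a common import issue
})

# Inspect the data type of every column.
print(df.dtypes)

# Every column except the company label should be numeric before PCA.
indicator_cols = [c for c in df.columns if c != "Company"]
non_numeric = [c for c in indicator_cols
               if not pd.api.types.is_numeric_dtype(df[c])]

# Coerce text columns to numbers; entries that cannot be parsed become NaN.
for col in non_numeric:
    df[col] = pd.to_numeric(df[col], errors="coerce")

print(df.dtypes)
```

Entries that cannot be parsed turn into NaN, so they surface later in the data cleaning step instead of silently breaking the analysis.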
Step 3. Splitting the Data into ‘Social’ and ‘Environmental’ DataFrames
Given that the focus is on social indicators, the data was split into two DataFrames:
- Social DataFrame: Contains columns related to social indicators as specified in the indicators table above.
- Environmental DataFrame: Contains the remaining columns, which relate to environmental indicators such as emissions, energy consumption, and water usage.
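The split can be sketched as follows, assuming the combined sheet holds one row per company. The column names for the environmental indicators and all values are hypothetical placeholders; only the idea of selecting columns by name comes from the text.

```python
import pandas as pd

# Placeholder frame; environmental column names and all values are hypothetical.
df = pd.DataFrame({
    "Company": ["A", "B"],
    "Gender Diversity": [12.5, 8.0],
    "LTIFR": [0.21, 0.35],
    "Energy consumption": [100.0, 250.0],
    "Water usage": [40.0, 55.0],
})

# Columns treated as social indicators (a subset of the indicators table).
social_cols = ["Gender Diversity", "LTIFR"]

# Everything else, apart from the company label, is treated as environmental.
environmental_cols = [c for c in df.columns
                      if c not in social_cols + ["Company"]]

# Keep the company name as the index so rows stay identifiable.
indexed = df.set_index("Company")
social_df = indexed[social_cols].copy()
environmental_df = indexed[environmental_cols].copy()
```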
Step 4. Data Cleaning
PCA cannot be performed on datasets with NaN values, so missing values must be handled first. Missing values can be eliminated using the dropna() method. However, when rows with missing values were dropped, only 5 observations remained, because data for hired, independent directors in board, and total wages was missing for many companies. So few observations would weaken the analysis. To handle this, these three indicators were removed from the model using drop(), which takes the names of the columns to remove as arguments.
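The effect of the order of operations can be illustrated with a small sketch. The values and the pattern of gaps below are invented for illustration and do not reproduce the study's data; the point is only that dropping sparse columns before dropping incomplete rows preserves more observations.

```python
import numpy as np
import pandas as pd

# Placeholder values only; the pattern of gaps is illustrative.
social_df = pd.DataFrame({
    "Gender Diversity": [12.5, 8.0, 15.2, 9.1],
    "LTIFR": [0.21, 0.35, np.nan, 0.10],
    "Hired": [np.nan, np.nan, 300.0, np.nan],
    "Total wages (in million)": [np.nan, 900.0, np.nan, np.nan],
})

# Dropping incomplete rows straight away: sparse columns wipe out most rows.
rows_if_dropped_first = len(social_df.dropna())

# Count the gaps per column to see which indicators are poorly reported.
missing_per_column = social_df.isna().sum()

# Remove the sparse indicators, then drop the rows that are still incomplete.
sparse_cols = ["Hired", "Total wages (in million)"]
social_clean = social_df.drop(columns=sparse_cols).dropna()

print(rows_if_dropped_first, "rows vs", len(social_clean), "rows")
```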
Normalization of the dataset
Principal Component Analysis (PCA) is a method to simplify complex data while retaining important information. However, if the data is not normalized first, features measured in different units or ranges can distort the PCA results.
If one feature ranges from 1 to 10 and another from 1 to 1,000,000, the second feature will dominate the PCA results simply because of its larger numbers. Normalization is like putting all measurements on the same scale so they can be compared fairly.
Data normalization puts each feature on a comparable scale, enabling Principal Component Analysis (PCA) to identify patterns within the data without being influenced by varying scales among features. Here, gender diversity, board diversity, and retention rate are percentages, while LTIFR and total trained employees or workers are reported as numbers, so normalization is needed to prevent the results from being sensitive to scale. Using the MinMaxScaler, a new DataFrame, social_df_normalized, was created to store the normalized data.
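With its default settings, scikit-learn's `MinMaxScaler` rescales each column to the range 0 to 1 using x' = (x - min) / (max - min). A minimal sketch is shown below; the two columns and their values are placeholders chosen so that the result is easy to verify by hand, not figures from the company reports.

```python
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

# Placeholder values on very different scales.
social_df = pd.DataFrame({
    "Gender Diversity": [5.0, 10.0, 15.0],                         # percent
    "Total trained employees/ workers": [1000.0, 5000.0, 9000.0],  # head count
})

# Default feature_range is (0, 1): each column is mapped to [0, 1].
scaler = MinMaxScaler()
social_df_normalized = pd.DataFrame(
    scaler.fit_transform(social_df),
    columns=social_df.columns,
    index=social_df.index,
)

print(social_df_normalized)
```

After scaling, both columns take the values 0, 0.5, and 1, so neither dominates simply because of its units.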
PCA modelling
Following data normalization, Principal Component Analysis (PCA) will be applied to reduce dimensionality and capture variance. For this analysis, only one principal component will be retained. This approach is adopted because all selected indicators for social sustainability assessment are being evaluated to determine their relevance in measuring the social sustainability aspect. Therefore, the number of components will be specified as one.
With this specification, the PCA model is fitted on the normalized data, and the results are derived. The PCA-applied data frame demonstrates a reduction in dimensionality from 14 observations and 5 columns to 14 observations and 1 dimension, representing social sustainability.
The explained variance ratio of the first principal component (PC1) is 33.67%. This means that the single component developed to represent social sustainability explains about one third of the total variance in the five indicators, while the remainder is not captured by this component. Focusing on one component captures the most significant shared trend without adding complexity.
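The fitting step can be sketched with scikit-learn's `PCA` class as below. The input here is synthetic random data with the same shape as described above (14 observations, 5 indicators), used only so that the example runs; it will not reproduce the 33.67% figure, which comes from the actual company data.

```python
import numpy as np
import pandas as pd
from sklearn.decomposition import PCA

columns = [
    "Gender Diversity", "LTIFR", "Retention rate",
    "Board diversity", "Total trained employees/ workers",
]

# Synthetic values in [0, 1] standing in for the normalized data (not study data).
rng = np.random.default_rng(0)
social_df_normalized = pd.DataFrame(rng.random((14, 5)), columns=columns)

# Retain a single principal component.
pca = PCA(n_components=1)
social_pca = pca.fit_transform(social_df_normalized)

# Shape goes from (14, 5) to (14, 1).
print(social_df_normalized.shape, "->", social_pca.shape)

# Share of the total variance captured by PC1.
print("Explained variance ratio of PC1:", pca.explained_variance_ratio_[0])
```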

Finally, factor loadings are derived to reveal the contribution of each indicator to the social sustainability component. These loadings indicate the strength and direction of the relationship between the original social sustainability indicators and the newly formed principal component. A higher absolute value of a factor loading suggests a more significant contribution of that particular indicator to the principal component, thereby highlighting its importance in defining social sustainability within the context of the Indian automobile sector. This analysis allows for the identification of the most influential indicators, enabling the industry to prioritize its social sustainability efforts effectively.
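One way to read off the contribution of each indicator is through the `components_` attribute of a fitted scikit-learn `PCA` object, sketched below. This is an assumption about how the loadings were obtained; the data is synthetic, and the sign of a principal component is arbitrary, so absolute values are used for ranking.

```python
import numpy as np
import pandas as pd
from sklearn.decomposition import PCA

columns = [
    "Gender Diversity", "LTIFR", "Retention rate",
    "Board diversity", "Total trained employees/ workers",
]

# Synthetic stand-in for the normalized data (not study data).
rng = np.random.default_rng(0)
social_df_normalized = pd.DataFrame(rng.random((14, 5)), columns=columns)

pca = PCA(n_components=1).fit(social_df_normalized)

# components_ has shape (n_components, n_features); row 0 holds the PC1 weights.
loadings = pd.Series(pca.components_[0], index=columns, name="PC1")

# Rank indicators by the size of their contribution, ignoring sign.
ranked = loadings.abs().sort_values(ascending=False)
print(ranked)

# The weights form a unit-length vector, so their squares sum to 1.
print("Sum of squared loadings:", float((loadings ** 2).sum()))
```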

The above figure shows that:
- Gender diversity has a high factor loading of 0.58, indicating a strong contribution to the principal component. This suggests that gender diversity is a key determinant of social sustainability in the dataset.
- LTIFR has the highest factor loading i.e. 0.63, meaning it has the strongest influence on the principal component. This implies that workplace safety, as measured by LTIFR, is critical in defining social sustainability.
- The retention rate has a moderate loading of 0.31, showing a noticeable but less significant contribution to the principal component. It plays a role in social sustainability but is not as impactful as gender diversity or LTIFR.
- Board diversity also has a moderate contribution of 0.39, slightly higher than the retention rate. This indicates that diversity in leadership impacts social sustainability but is secondary to workplace safety and gender diversity.
- Total Trained Employees or Workers has the lowest factor loading i.e. 0.14, suggesting a minimal contribution to the principal component. While training is relevant, it is not as impactful in defining social sustainability in this dataset.
Based on the trend in factor loadings, a threshold of approximately 0.3 was selected as the criterion for identifying the most relevant indicators of social sustainability. The indicators with loadings above this threshold, which contribute significantly to the social sustainability score, are gender diversity, LTIFR, retention rate, and board diversity. Selecting these indicators allows for a focused approach to social sustainability improvement within the Indian automobile sector.
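The selection rule can be applied directly to the loadings reported above. The sketch below uses those five reported values; the variable names are illustrative.

```python
import pandas as pd

# Factor loadings as reported in the text above.
loadings = pd.Series({
    "Gender diversity": 0.58,
    "LTIFR": 0.63,
    "Retention rate": 0.31,
    "Board diversity": 0.39,
    "Total trained employees/ workers": 0.14,
})

# Keep indicators whose absolute loading meets the 0.3 criterion.
threshold = 0.3
selected = loadings[loadings.abs() >= threshold].sort_values(ascending=False)
print(selected)

# Squared loadings give each indicator's share of the component.
sum_of_squares = float((loadings ** 2).sum())
print("Sum of squared loadings:", round(sum_of_squares, 4))
```

The filter keeps LTIFR, gender diversity, board diversity, and retention rate, in that order, and drops total trained employees or workers. The squared loadings sum to roughly 1, which is consistent with the reported values being the weights of a unit-length principal component.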
LTIFR & gender diversity are influential indicators of sustainability
By employing Principal Component Analysis (PCA), this study identified and prioritized key indicators of social sustainability, offering actionable insights for the automotive sector. The analysis highlights that workplace safety (LTIFR) and gender diversity are the most influential indicators, followed by retention rate and board diversity, which also play important roles. In contrast, employee training metrics show a relatively minimal contribution in this dataset, suggesting a need for further exploration. Past research has also shown that LTIFR is an important social sustainability indicator, representing occupational health and safety and hazard prevention techniques in the workplace (Koskela, 2014; Roca & Searcy, 2012). Similarly, several studies find gender diversity to be an important contributor to broader sustainability goals (Arayakarnkul et al., 2022; Galdiero et al., 2024; Grosser, 2009; Lin & Efranto, 2023).
The findings emphasize the importance of aligning organizational strategies with these critical indicators to improve social sustainability performance. By focusing on these high-impact metrics, automotive companies can enhance their contribution to the UN Sustainable Development Goals while simultaneously strengthening their brand value. The application of PCA not only simplifies the analysis of complex sustainability datasets but also helps identify impactful variables more precisely. As the automotive sector continues to adopt sustainable practices, incorporating the identified indicators into strategy and reporting frameworks will be pivotal for fostering long-term social and environmental well-being.
References
- Helman, J., Rosienkiewicz, M., Cholewa, M., Molasy, M., Oleszek, S., 2023. Towards GreenPLM—Key Sustainable Indicators Selection and Assessment Method Development. Energies 16, 1137. https://doi.org/10.3390/en16031137
- Husgafvel, R., 2021. Exploring Social Sustainability Handprint—Part 2: Sustainable Development and Sustainability. Sustainability 13, 11051. https://doi.org/10.3390/su131911051
- Jolliffe, I.T., Cadima, J., 2016. Principal component analysis: a review and recent developments. Philos. Trans. R. Soc. A 374, 20150202. https://doi.org/10.1098/rsta.2015.0202
- Lukin, E., Krajnović, A., Bosna, J., 2022. Sustainability Strategies and Achieving SDGs: A comparative analysis of leading companies in the automotive industry. Sustainability 14, 4000. https://doi.org/10.3390/su14074000
- Ly, A.M., Cope, M.R., 2023. New Conceptual Model of Social Sustainability: Review from Past Concepts and Ideas. Int. J. Environ. Res. Public Health 20, 5350. https://doi.org/10.3390/ijerph20075350
- Rajesh, R., 2020. Exploring the sustainability performances of firms using environmental, social, and governance scores. J. Clean. Prod. 247, 119600. https://doi.org/10.1016/j.jclepro.2019.119600
- Vijayakumar, A., Mahmood, M.N., Gurmu, A., Kamardeen, I., Alam, S., 2022. Social sustainability indicators for road infrastructure projects: A systematic literature review. IOP Conf. Ser. Earth Environ. Sci. 1101, 022039. https://doi.org/10.1088/1755-1315/1101/2/022039
- Amiram, D., Bozanic, Z., Rouen, E., 2015. Financial statement errors: evidence from the distributional properties of financial statement numbers. Rev. Account. Stud. 20, 1540–1593. https://doi.org/10.1007/s11142-015-9333-z
- Arayakarnkul, P., Chatjuthamard, P., Treepongkaruna, S., 2022. Board gender diversity, corporate social commitment and sustainability. Corp. Soc. Responsib. Environ. Manag. 29, 1706–1721. https://doi.org/10.1002/csr.2320
- Asbahi, A.A.M.H.A., Gang, F.Z., Iqbal, W., Abass, Q., Mohsin, M., Iram, R., 2019. Novel approach of Principal Component Analysis method to assess the national energy performance via Energy Trilemma Index. Energy Rep. 5, 704–713. https://doi.org/10.1016/j.egyr.2019.06.009
- Cao, C., Wei, T., Xu, S., Su, F., Fang, H., 2023. Comprehensive evaluation of higher education systems using indicators: PCA and EWM methods. Humanit. Soc. Sci. Commun. 10, 1–12. https://doi.org/10.1057/s41599-023-01938-x
- Crucean, A., 2024. The influence of the risk of fraud and accounting errors in the financial statements on the opinion of the financial auditor. Eur. J. Account. Finance Bus. 7, 2019. https://doi.org/10.4316/EJAFB.2019.711
- Galdiero, C., Maltempo, C., Marrapodi, R., Martinez, M., 2024. Gender Diversity: An Opportunity for Socially Inclusive Human Resource Management Policies for Organizational Sustainability. Soc. Sci. 13, 173. https://doi.org/10.3390/socsci13030173
- Gan, X., Fernandez, I.C., Guo, J., Wilson, M., Zhao, Y., Zhou, B., Wu, J., 2017. When to use what: Methods for weighting and aggregating sustainability indicators. Ecol. Indic. 81, 491–502. https://doi.org/10.1016/j.ecolind.2017.05.068
- Grosser, K., 2009. Corporate social responsibility and gender equality: women as stakeholders and the European Union sustainability strategy. Bus. Ethics Eur. Rev. 18, 290–307. https://doi.org/10.1111/j.1467-8608.2009.01564.x
- Koskela, M., 2014. Occupational health and safety in corporate social responsibility reports. Saf. Sci. 68, 294–308. https://doi.org/10.1016/j.ssci.2014.04.011
- Li, T., Zhang, H., Yuan, C., Liu, Z., Fan, C., 2012. A PCA-based method for construction of composite sustainability indicators. Int J Life Cycle Assess 17.
- Lin, C.J., Efranto, R.Y., 2023. Do Age and Gender Change the Perception of Workplace Social Sustainability? Sustainability 15, 5013. https://doi.org/10.3390/su15065013
- Lisowski, S., Berger, M., Caspers, J., Mayr-Rauch, K., Bäuml, G., Finkbeiner, M., 2020. Criteria-Based Approach to Select Relevant Environmental SDG Indicators for the Automobile Industry. Sustainability 12, 8811. https://doi.org/10.3390/su12218811
- Martins, M.S., Kalil, R.M.L., Rosa, F.D., 2021. Sustainable neighbourhoods: applicable indicators through principal component analysis. Proc. Inst. Civ. Eng. – Urban Des. Plan. 174, 25–36. https://doi.org/10.1680/jurdp.20.00058
- Munier, N., 2011. Methodology to select a set of urban sustainability indicators to measure the state of the city, and performance assessment. Ecol. Indic. 11, 1020–1026. https://doi.org/10.1016/j.ecolind.2011.01.006
- Nathan, H.S.K., Reddy, B.S., 2011. Criteria selection framework for sustainable development indicators. Int. J. Multicriteria Decis. Mak. 1, 257. https://doi.org/10.1504/IJMCDM.2011.041189
- Roca, L.C., Searcy, C., 2012. An analysis of indicators disclosed in corporate sustainability reports. J. Clean. Prod. 20, 103–118. https://doi.org/10.1016/j.jclepro.2011.08.002
- Sabauri, L., Kvatashidze, N., 2024. Sustainability reporting issues. J. Entrep. Sustain. 11. https://doi.org/10.9770/jesi.2023.11.2(19)
- Turkmen, B., 2016. Errors and Abuses in Financial Accounting and Results. Procedia Econ. Finance 38. https://doi.org/10.1016/S2212-5671(16)30179-4
- Wang, B., Chen, T., 2022. Social Progress beyond GDP: A Principal Component Analysis (PCA) of GDP and Twelve Alternative Indicators. Sustainability 14, 6430. https://doi.org/10.3390/su14116430
- Zhao, G., 2015. A sustainability classification for a country based on PCA. Presented at the International Symposium on Social Science, Atlantis Press, Baoding.