Population geneticists can trace lineages along two paths: the Y chromosome, which is passed down the paternal line, and mitochondrial DNA, which is inherited from the maternal side. This controversial question has drawn a large number of researchers, and genetic studies were still producing new results as recently as 2018, such as the one titled ‘The Genomic Formation of South and Central Asia’.
Source: https://openthemagazine.com/cover-story/the-origins/
Lhendup G Bhutia, ‘The Origins’, cover story
Human genome sequencing has shown that the Y chromosome haplogroup R1a is found across most of the Indo-European-speaking areas of the globe. This haplogroup branches into two subclades, or subgroups: R1a-Z282, found only in Europe, and R1a-Z93, found in South and Central Asia. The second branch is present in most lineages of Indian descent. However, the earliest traces of it are found in Ukraine, dated to roughly 5000 BCE to 3500 BCE, which suggests that both branches of the Y chromosome haplogroup originated in that area and that the people who carry either subgroup had ancestors from there.
The connection between this haplogroup and Indo-European-speaking communities is supported by the fact that it is found in higher concentrations in the upper castes associated with passing on Sanskrit (one of the earliest documented Indo-European languages).
The present population of India is known to be a mixture of ANI (Ancestral North Indians) and ASI (Ancestral South Indians).
Steppe pastoralists from the Kazakh region moved southwards and mixed with the Harappans already living there. The ANI are the result of intermingling between the Harappans and these steppe pastoralists, while the ASI are the result of intermingling between the Harappans and the First Indians. These steppe pastoralists, who are believed to have spoken a Proto-Indo-European language, are called the Yamnaya.
Source: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6822619/
The Formation of Human Populations in South and Central Asia. Narasimhan et al., 2019
The Yamnaya had been influenced by contemporary communities such as the Maikop, who had a role in developing the use of the wheel, the wagon and the horse. The domestication of the horse made the Yamnaya so mobile that the only hard evidence of their existence is their ‘kurgans’ (burial mounds), where their horses have been found buried with them. Such mounds were seen for the first time in the Caucasus, suggesting that the people of this region could have spoken the Proto-Indo-European language before the Yamnaya.
Sources:
The formation of human populations in South and Central Asia (science.org)
An Ancient Harappan Genome Lacks Ancestry from Steppe Pastoralists or Iranian Farmers (cell.com)