About OverGeneDB:

OverGeneDB is a database of natural antisense transcripts (NATs) in the human and mouse genomes. What distinguishes it is that the collection consists only of NAT pairs that overlap at their 5' end(s) and in which both transcripts code for proteins. It is the first database to provide information on overlaps involving alternative 5' end(s) across dozens of human and mouse libraries generated with the experimental TSS-Seq procedure (Tsuchihara et al. 2009).
Wojciech Rosikiewicz, Yutaka Suzuki, Izabela Makałowska; OverGeneDB: a database of 5′ end protein coding overlapping genes in human and mouse genomes, Nucleic Acids Research, gkx948, https://doi.org/10.1093/nar/gkx948
Vitruvian Man was originally drawn by Leonardo da Vinci. The image on the "Browse" page was taken from Wikipedia.
Vitruvian Mouse is used with David Deen's permission. The original may be found on the author's website.

What protocols were used for the data analysis within OverGeneDB:

Genome coordinates of all known RefSeq mRNAs (starting with accession numbers NM_*) [1] were downloaded from UCSC using Table Browser [2] for human GRCh38/hg38 and mouse NCBI37/mm9 genome versions in September 2014 and May 2012, respectively.

[1] O'Leary, N.A., et al., Reference sequence (RefSeq) database at NCBI: current status, taxonomic expansion, and functional annotation. Nucleic Acids Res, 2016. 44(D1): p. D733-45.
[2] Karolchik, D., et al., The UCSC Table Browser data retrieval tool. Nucleic Acids Res, 2004. 32(Database issue): p. D493-6.
Transcription start site (TSS) coordinates were downloaded from the DBTSS database, version 9 for human [1] and version 8 for mouse [2]. The human data comprised 73 libraries: 19 adult organs, 5 fetal organs, 7 tissue types cultured under different conditions (resulting in 23 cell-line libraries), and 26 lung adenocarcinoma libraries. The mouse data comprised 6 organs and 4 embryonal samples harvested after 7, 11, 15 and 17 days of growth.

[1] Suzuki, A., et al., DBTSS as an integrative platform for transcriptome, epigenome and genome sequence variation data. Nucleic Acids Res, 2015. 43(Database issue): p. D87-91.
[2] Yamashita, R., et al., DBTSS: DataBase of Transcriptional Start Sites progress report in 2012. Nucleic Acids Res, 2012. 40(Database issue): p. D150-4.
The broadest representative gene coordinates span from the most distal TSS assigned to a gene in a particular library to the most distal 3' end among all of the gene's mRNAs.
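The definition above can be sketched in a few lines of Python. This is only an illustration, not the code used to build OverGeneDB: the function name and its arguments are hypothetical, and it assumes that "distal" means the most upstream TSS and the most downstream 3' end relative to the direction of transcription.

```python
def broadest_coordinates(strand, tss_positions, three_prime_ends):
    """Return (genomic_start, genomic_end) of the broadest gene span.

    strand           -- "+" or "-"
    tss_positions    -- TSS positions assigned to the gene in one library
    three_prime_ends -- 3' end positions of all mRNAs of the gene
    (All names are hypothetical; positions are genomic coordinates.)
    """
    if not tss_positions or not three_prime_ends:
        raise ValueError("at least one TSS and one 3' end are required")
    if strand == "+":
        # Transcription runs left to right: most upstream TSS is the minimum.
        return min(tss_positions), max(three_prime_ends)
    if strand == "-":
        # Transcription runs right to left: most upstream TSS is the maximum.
        return min(three_prime_ends), max(tss_positions)
    raise ValueError("strand must be '+' or '-'")


plus_span = broadest_coordinates("+", [100, 150], [900, 1200])
minus_span = broadest_coordinates("-", [5000, 4800], [4000, 4200])
```

Because the TSS set differs between libraries, the same gene may receive different broadest coordinates in different libraries.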
Each TSS library was scanned independently for all pairs of transcriptional units located on opposite DNA strands and partially overlapping at their 5' ends by at least 1 bp. The results from the individual libraries were then merged into a summary table, which users may view on the Browse page of OverGeneDB. In total, we identified 582 human and 113 mouse gene pairs that overlap in at least one library with at least one TSS.
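A minimal sketch of the 5' end overlap test for a single pair of transcriptional units is shown here. It is not the OverGeneDB script itself: the `Unit` type and the function name are hypothetical, coordinates are assumed to be 1-based and inclusive, and "partially overlapping" is assumed to exclude pairs in which one unit lies entirely within the other.

```python
from typing import NamedTuple


class Unit(NamedTuple):
    """A transcriptional unit (hypothetical structure for illustration)."""
    name: str
    strand: str  # "+" or "-"
    start: int   # leftmost genomic position, 1-based inclusive
    end: int     # rightmost genomic position, 1-based inclusive


def five_prime_overlap(a, b):
    """Return the length in bp of a head-to-head (5' end) overlap, else 0."""
    if a.strand == b.strand:
        return 0  # units must lie on opposite strands
    plus, minus = (a, b) if a.strand == "+" else (b, a)
    # The 5' end of the plus-strand unit is plus.start and the 5' end of the
    # minus-strand unit is minus.end. A partial 5' overlap requires the
    # minus-strand unit to begin to the left of the plus-strand unit and to
    # end inside it.
    head_to_head = minus.start < plus.start <= minus.end < plus.end
    return minus.end - plus.start + 1 if head_to_head else 0


gene_a = Unit("geneA", "+", 100, 500)
gene_b = Unit("geneB", "-", 50, 120)
gene_c = Unit("geneC", "-", 450, 800)  # overlaps geneA at the 3' ends only

overlap_ab = five_prime_overlap(gene_a, gene_b)
overlap_ac = five_prime_overlap(gene_a, gene_c)
```

Running such a test over every opposite-strand pair in one library, and repeating it per library, would yield the per-library lists that are later merged into the summary table.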
Genes may be expressed simultaneously from one or more TSSs, which in numerous cases leads to a situation in which genes overlap only through a subset of their alternative transcription start sites. To determine to what degree a gene is transcribed from the overlapping TSSs, we developed the overlap ratio (OR), which is simply the fraction of the total gene expression assigned to the overlap region. To determine “how much” a gene pair overlaps, that is, to what extent the transcripts in the pair are transcribed from the overlap region, we also developed JoinedOR, which is the product of the OR values of the two genes in the overlapping pair. OR and JoinedOR values equal to 1 indicate that all transcripts originated from the overlapping TSSs in the gene and in the pair, respectively. The lower the value, the smaller the subset of transcripts that originated from the overlap region, down to 0, which indicates that no expression is assigned to the overlapping transcription start sites.
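The two measures can be illustrated with a short Python sketch. The function names, the dictionary layout, and the expression values are hypothetical and chosen only for the example; the unit of expression (for instance TSS tag counts) is likewise an assumption, since only the ratio matters.

```python
def overlap_ratio(tss_expression, overlapping_tss):
    """OR: fraction of a gene's total expression that comes from the TSSs
    located in the overlap region.

    tss_expression  -- dict mapping a TSS identifier to its expression value
    overlapping_tss -- set of TSS identifiers that fall in the overlap region
    """
    total = sum(tss_expression.values())
    if total == 0:
        return 0.0  # no expression at all, so nothing comes from the overlap
    in_overlap = sum(value for tss, value in tss_expression.items()
                     if tss in overlapping_tss)
    return in_overlap / total


def joined_or(or_first, or_second):
    """JoinedOR: product of the OR values of the two genes in a pair."""
    return or_first * or_second


# Gene A: 30 of 40 expression units come from its overlapping TSS.
or_a = overlap_ratio({"tss1": 30, "tss2": 10}, {"tss1"})
# Gene B: 5 of 10 expression units come from its overlapping TSS.
or_b = overlap_ratio({"tss1": 5, "tss2": 5}, {"tss2"})
pair_score = joined_or(or_a, or_b)
```

In this made-up example OR is 0.75 for gene A and 0.5 for gene B, so JoinedOR is 0.375. Because JoinedOR is a product, it reaches 1 only when both genes are transcribed exclusively from overlapping TSSs, and it drops to 0 as soon as either gene has no expression from the overlap region.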
We used human GRCh38/hg38 and mouse NCBI37/mm9 genome versions.
We have created a GitHub repository, which may be found here. The repository contains the most essential scripts for OverGeneDB, namely the automated method for identifying overlapping gene pairs. Based on the raw data from DBTSS and RefSeq, the provided script generates the list of overlapping gene pairs and the assigned OR values for the studied library. The repository also includes example input and output files as well as a ReadMe file.