When studying ecological patterns at large scales, ecologists are often unable to identify all collections. They must then either omit the unidentified records entirely, without knowing the effect of doing so, or undertake very costly and time-consuming efforts to identify them. These "indets" may be of critical importance, but their impact on the reliability of ecological analyses is still poorly known. We investigated the consequences of omitting unidentified records and provide an explanation for the results. We used three large-scale, independent datasets (Guyana/Suriname, French Guiana, Ecuador), each consisting of records identified to a valid species name (identified morpho-species, IMS) and a number of unidentified records (unidentified morpho-species, UMS). For each dataset we created a subset containing only the IMS and compared it with the complete dataset containing all morpho-species (AMS = IMS + UMS) in the following analyses: species diversity (Fisher's alpha), similarity of species composition, Mantel test, and ordination (NMDS). In addition, we simulated an even larger number of unidentified records for all three datasets and re-analyzed the agreement between similarities with these simulated datasets. For all analyses, results were extremely similar whether we used the complete datasets or the truncated subsets. IMS predicted ≥91% of the variation in AMS in all tests and analyses. Even when we simulated a larger fraction of UMS, IMS predicted the results for AMS rather well. Using only IMS also outperformed using higher-taxon data (genus-level identification) in similarity analyses. The high congruence between IMS and AMS across all analyses suggests that patterns of similarity and composition are very robust. In other words, having a large number of unidentified species in a dataset may not affect our conclusions as much as is often thought.
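To make the diversity comparison concrete, the following is a minimal sketch of how Fisher's alpha could be computed for a complete sample (AMS) and for its identified subset (IMS). It solves the standard relation S = α ln(1 + N/α) by bisection, where S is the number of species and N the number of individuals. The function name `fishers_alpha`, the taxon labels, and all counts are made up for illustration; they are not taken from the datasets or the software used in the study.

```python
import math

def fishers_alpha(n_individuals, n_species, tol=1e-10):
    """Solve S = alpha * ln(1 + N / alpha) for alpha by bisection."""
    if not 0 < n_species < n_individuals:
        raise ValueError("a finite alpha requires 0 < S < N")

    def f(alpha):
        # Increasing in alpha: negative below the root, positive above it.
        return alpha * math.log(1.0 + n_individuals / alpha) - n_species

    lo, hi = 1e-12, 1.0
    while f(hi) < 0:          # expand the bracket until it contains the root
        hi *= 2.0
    while hi - lo > tol * max(1.0, hi):
        mid = 0.5 * (lo + hi)
        if f(mid) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical plot: counts per morpho-species (invented numbers).
counts = {"sp_A": 40, "sp_B": 25, "sp_C": 12, "sp_D": 6, "sp_E": 3,
          "indet_1": 2, "indet_2": 1, "indet_3": 1}

# AMS uses every record; IMS drops the unidentified morpho-species.
ams = counts
ims = {name: n for name, n in counts.items() if not name.startswith("indet")}

alpha_ams = fishers_alpha(sum(ams.values()), len(ams))
alpha_ims = fishers_alpha(sum(ims.values()), len(ims))
print(f"Fisher's alpha, AMS: {alpha_ams:.3f}  IMS: {alpha_ims:.3f}")
```

In a real analysis this would be repeated per plot, and the IMS values regressed against the AMS values to see how much of the variation the identified subset recovers.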
Using three commonly used macro-ecological analyses, we show that omitting unidentified species from datasets does not affect our conclusions as much as is often thought. Results were extremely similar whether we used the complete dataset or only the taxa named to species level. We also show that, with a higher-taxon approach, similarity values deviate much more from those expected based on the all-taxa dataset than they do when only the named-species dataset is used.
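The comparison of similarity values can be sketched as follows. The example computes pairwise Bray-Curtis similarities between plots for three versions of the same toy data (all morpho-species, identified species only, and counts pooled by genus), then correlates each reduced version with the complete one. This correlation is the statistic underlying a Mantel test; the permutation step is omitted for brevity. The choice of Bray-Curtis, the helper names, and the randomly generated data are assumptions made for illustration, and pooling unidentified records into their genus is likewise an assumption about how a genus-level dataset would be built.

```python
import math
import random
from itertools import combinations

def bray_curtis(a, b):
    """Bray-Curtis similarity between two {taxon: count} dicts."""
    shared = sum(min(a[t], b[t]) for t in a.keys() & b.keys())
    total = sum(a.values()) + sum(b.values())
    return 2.0 * shared / total if total else 0.0

def pairwise(plots):
    """Similarities for every unordered pair of plots, in a fixed order."""
    return [bray_curtis(p, q) for p, q in combinations(plots, 2)]

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def regroup(plot, key):
    """Re-aggregate counts under key(taxon); taxa mapped to None are dropped."""
    out = {}
    for taxon, n in plot.items():
        k = key(taxon)
        if k is not None:
            out[k] = out.get(k, 0) + n
    return out

# Invented data: each taxon is (genus, morpho-species label, identified?).
rng = random.Random(1)
taxa = [(f"Genus{g}", f"sp{s}", rng.random() < 0.7)
        for g in range(6) for s in range(5)]
plots = [{t: rng.randint(1, 20) for t in taxa if rng.random() < 0.5}
         for _ in range(8)]

ams = pairwise(plots)                                            # all morpho-species
ims = pairwise([regroup(p, lambda t: t if t[2] else None) for p in plots])
genus = pairwise([regroup(p, lambda t: t[0]) for p in plots])    # pooled by genus

r_ims = pearson(ims, ams)
r_genus = pearson(genus, ams)
print(f"IMS vs AMS: r = {r_ims:.3f}; genus vs AMS: r = {r_genus:.3f}")
```

Because the toy data are random, the printed correlations say nothing about real communities; the sketch only shows how the three similarity sets are derived from the same records and compared on a common footing.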