Create Unique and Valid Identifiers in Java
Generating valid and unique identifiers is a fundamental requirement in software systems. Identifiers are used to uniquely represent entities such as users, transactions, sessions, database records, and distributed messages. A well-designed identifier strategy ensures correctness, avoids collisions, and scales efficiently as the system grows. Let us look at how to create unique random identifiers in Java.
1. Overview
A unique identifier must satisfy two key properties to be effective and reliable in real-world systems:
- Uniqueness: No two identifiers should collide within the defined scope, whether that scope is a single JVM, a database table, or a globally distributed system. Collisions can lead to data corruption, overwrites, or incorrect entity resolution, making strong uniqueness guarantees essential.
- Validity: The identifier should conform to predefined formats, length constraints, and character rules so that it can be consistently stored, indexed, transmitted, and validated across different components and services.
Common approaches include random-based identifiers, time-based identifiers, centralized counters, and hybrid strategies, each offering different trade-offs in terms of scalability, ordering, fault tolerance, and implementation complexity.
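The validity property described above can be enforced with a small format check before an identifier is stored or accepted from another service. The sketch below is illustrative only: the class name IdentifierValidator and the rule "exactly 20 alphanumeric characters" are assumptions chosen for this example, not requirements stated elsewhere in this article.

```java
import java.util.UUID;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; the format rules are assumptions.
final class IdentifierValidator {

    // Assumed rule: exactly 20 characters from A-Z, a-z, 0-9.
    private static final Pattern ALPHANUMERIC_20 = Pattern.compile("[A-Za-z0-9]{20}");

    // Canonical textual UUID layout: 8-4-4-4-12 hexadecimal digits.
    private static final Pattern UUID_TEXT = Pattern.compile(
            "[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}");

    // Returns true only if the candidate matches the assumed 20-character rule.
    static boolean isValidAlphanumericId(String candidate) {
        return candidate != null && ALPHANUMERIC_20.matcher(candidate).matches();
    }

    // Returns true only if the candidate has the canonical UUID text layout.
    static boolean isValidUuid(String candidate) {
        return candidate != null && UUID_TEXT.matcher(candidate).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValidUuid(UUID.randomUUID().toString()));
        System.out.println(isValidUuid("not-a-uuid"));
        System.out.println(isValidAlphanumericId("Aq9LxP0MZb2H8S1FQWYr"));
    }
}
```

Precompiled patterns keep the check cheap enough to run on every request. A check like this only guards the format; it says nothing about whether the identifier is unique.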
2. Java-Based Random Identifier Generation
One of the simplest and most widely used approaches in Java is generating identifiers using random values. Java provides built-in utilities that make this both easy and reliable.
// RandomIdentifierGenerator.java
package jcg.example;

import java.security.SecureRandom;
import java.util.UUID;

public class RandomIdentifierGenerator {

    private static final SecureRandom SECURE_RANDOM = new SecureRandom();
    private static final String ALPHANUMERIC =
            "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789";

    // UUID-based identifier
    public static String generateUUID() {
        return UUID.randomUUID().toString();
    }

    // Secure random alphanumeric identifier
    public static String generateRandomId(int length) {
        StringBuilder id = new StringBuilder(length);
        for (int i = 0; i < length; i++) {
            int index = SECURE_RANDOM.nextInt(ALPHANUMERIC.length());
            id.append(ALPHANUMERIC.charAt(index));
        }
        return id.toString();
    }

    public static void main(String[] args) {
        System.out.println("UUID Identifier:");
        System.out.println(generateUUID());
        System.out.println("\nSecure Random Identifier (20 chars):");
        System.out.println(generateRandomId(20));
    }
}
2.1 Code Explanation
This Java class demonstrates two common approaches for generating unique identifiers using only standard libraries. The SecureRandom instance is created once as a static field and supplies cryptographically strong random numbers, which are harder to predict than the output of java.util.Random and therefore safer for identifiers that may be exposed externally. The ALPHANUMERIC constant defines the 62-character set allowed in the generated IDs. The generateUUID() method calls UUID.randomUUID() to produce a 128-bit version 4 UUID, of which 122 bits are random; the probability of a collision is extremely low, which makes it suitable for distributed systems that cannot coordinate ID assignment. The generateRandomId(int length) method builds an alphanumeric identifier of the requested length by repeatedly picking a random character from the character set and appending it to a StringBuilder. Finally, the main method prints a UUID-based identifier and a 20-character secure random identifier. Choosing between the two depends on requirements such as readability, size, and format; neither approach produces ordered identifiers.
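To put "extremely low probability of collision" into numbers, the following sketch estimates the randomness of an identifier and the chance of at least one collision using the standard birthday-bound approximation. The class and method names are hypothetical, and the estimate assumes every character or bit is drawn uniformly and independently.

```java
// Hypothetical helper for illustration; assumes uniformly random identifiers.
final class CollisionEstimate {

    // Bits of randomness in an ID of `length` characters drawn from `alphabetSize` symbols.
    static double entropyBits(int alphabetSize, int length) {
        return length * (Math.log(alphabetSize) / Math.log(2));
    }

    // Birthday-bound approximation: p is roughly 1 - exp(-n(n-1) / 2^(bits+1)).
    static double collisionProbability(double idCount, double bits) {
        double exponent = -(idCount * (idCount - 1)) / Math.pow(2, bits + 1);
        return -Math.expm1(exponent); // 1 - e^exponent, accurate for tiny values
    }

    public static void main(String[] args) {
        double alphanumericBits = entropyBits(62, 20); // about 119 bits
        double uuidBits = 122;                         // random bits in a version 4 UUID
        double idCount = 1e9;                          // assumed number of IDs issued

        System.out.printf("20-char alphanumeric ID: %.1f bits%n", alphanumericBits);
        System.out.printf("P(collision), 1e9 alphanumeric IDs: %.3e%n",
                collisionProbability(idCount, alphanumericBits));
        System.out.printf("P(collision), 1e9 version 4 UUIDs: %.3e%n",
                collisionProbability(idCount, uuidBits));
    }
}
```

The same helper can be used to pick a length: shortening the alphanumeric ID reduces its entropy by roughly six bits per character, so the collision probability rises quickly as the ID gets shorter.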
2.2 Code Output
The output below shows a sample UUID value and a secure random alphanumeric identifier, illustrating two different formats of unique identifiers generated by the program.
UUID Identifier:
c3d2d58a-ec7c-4d5a-9c3e-4a2f2c92c6b8

Secure Random Identifier (20 chars):
Aq9LxP0MZb2H8S1FQWYr
3. Time-Based Identifier Generation Using Snowflake
// SnowflakeIdGenerator.java
package jcg.example;

public class SnowflakeIdGenerator {

    // Custom epoch in milliseconds; timestamps are stored relative to it
    private static final long EPOCH = 1700000000000L;

    // Bit allocation
    private static final long MACHINE_ID_BITS = 10L;
    private static final long SEQUENCE_BITS = 12L;

    private static final long MAX_MACHINE_ID = (1L << MACHINE_ID_BITS) - 1;
    private static final long MAX_SEQUENCE = (1L << SEQUENCE_BITS) - 1;

    private final long machineId;
    private long lastTimestamp = -1L;
    private long sequence = 0L;

    public SnowflakeIdGenerator(long machineId) {
        if (machineId < 0 || machineId > MAX_MACHINE_ID) {
            throw new IllegalArgumentException("Invalid machine id");
        }
        this.machineId = machineId;
    }

    public synchronized long nextId() {
        long currentTimestamp = currentTime();

        if (currentTimestamp < lastTimestamp) {
            throw new IllegalStateException("Clock moved backwards");
        }

        if (currentTimestamp == lastTimestamp) {
            // Same millisecond: advance the sequence, wrapping at MAX_SEQUENCE
            sequence = (sequence + 1) & MAX_SEQUENCE;
            if (sequence == 0) {
                // Sequence exhausted for this millisecond
                currentTimestamp = waitNextMillis(currentTimestamp);
            }
        } else {
            sequence = 0;
        }

        lastTimestamp = currentTimestamp;

        return ((currentTimestamp - EPOCH) << (MACHINE_ID_BITS + SEQUENCE_BITS))
                | (machineId << SEQUENCE_BITS)
                | sequence;
    }

    // Spins until the clock is past lastTimestamp; "<=" also covers a clock
    // that steps backwards while waiting
    private long waitNextMillis(long currentTimestamp) {
        while (currentTimestamp <= lastTimestamp) {
            currentTimestamp = currentTime();
        }
        return currentTimestamp;
    }

    private long currentTime() {
        return System.currentTimeMillis();
    }

    public static void main(String[] args) {
        SnowflakeIdGenerator generator = new SnowflakeIdGenerator(1);
        for (int i = 0; i < 5; i++) {
            System.out.println(generator.nextId());
        }
    }
}
3.1 Code Explanation
This class implements a Snowflake-style distributed ID generator that produces unique, time-ordered 64-bit identifiers without relying on a database. Each ID is packed from three fields: the number of milliseconds elapsed since a custom EPOCH (which keeps the timestamp field small), a machine ID of MACHINE_ID_BITS (10) bits that identifies the node or instance, and a sequence number of SEQUENCE_BITS (12) bits that distinguishes IDs generated within the same millisecond. The maximum values for machine ID and sequence are computed with bit shifts and used to enforce valid ranges. Each generator instance must be given a distinct machineId; cross-node uniqueness depends on this assignment. The synchronized nextId() method keeps generation thread-safe: it reads the current system time, throws an exception if the clock has moved backwards, and increments the sequence counter when another ID is requested in the same millisecond. If the sequence wraps around to zero, the generator busy-waits until the next millisecond. Finally, the ID is assembled by left-shifting the timestamp offset and machine ID into their bit positions and combining them with the sequence using bitwise OR. The main method generates five IDs in a row. Values from a single generator are strictly increasing, and they are unique across nodes as long as machine IDs are distinct, which makes the scheme suitable for high-throughput distributed systems.
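Because the fields sit at fixed bit positions, an ID can be taken apart again with the inverse shifts and masks, which is handy when debugging or when deriving a creation time from an ID. The sketch below assumes the same layout and epoch as SnowflakeIdGenerator; the class name SnowflakeIdDecoder and its methods are hypothetical and not part of the generator itself.

```java
import java.time.Instant;

// Hypothetical decoder for illustration; the constants mirror the generator's layout.
final class SnowflakeIdDecoder {

    private static final long EPOCH = 1700000000000L;
    private static final long MACHINE_ID_BITS = 10L;
    private static final long SEQUENCE_BITS = 12L;
    private static final long MAX_MACHINE_ID = (1L << MACHINE_ID_BITS) - 1;
    private static final long MAX_SEQUENCE = (1L << SEQUENCE_BITS) - 1;

    // Upper bits hold the offset from EPOCH; add EPOCH back to get Unix milliseconds.
    static long timestampMillis(long id) {
        return (id >>> (MACHINE_ID_BITS + SEQUENCE_BITS)) + EPOCH;
    }

    // Middle 10 bits hold the machine ID.
    static long machineId(long id) {
        return (id >>> SEQUENCE_BITS) & MAX_MACHINE_ID;
    }

    // Lowest 12 bits hold the per-millisecond sequence.
    static long sequence(long id) {
        return id & MAX_SEQUENCE;
    }

    public static void main(String[] args) {
        // Pack an ID by hand the same way nextId() does, then decode it.
        long offsetMillis = 123_456L; // arbitrary example offset from EPOCH
        long id = (offsetMillis << (MACHINE_ID_BITS + SEQUENCE_BITS))
                | (5L << SEQUENCE_BITS)
                | 7L;

        System.out.println("created at: " + Instant.ofEpochMilli(timestampMillis(id)));
        System.out.println("machine id: " + machineId(id));
        System.out.println("sequence:   " + sequence(id));
    }
}
```

Note that the decoder only works if it uses exactly the same epoch and bit widths as the generator that produced the ID, so in a real project these constants would normally be shared rather than duplicated.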
3.2 Code Output
The sample output below shows five consecutive Snowflake identifiers; each value is unique and larger than the previous one. Actual values depend on the current time and the configured machine ID.
573822145812930560
573822145812930561
573822145812930562
573822145812930563
573822145812930564
4. Performance
Performance characteristics of identifier generation strategies differ significantly and should be evaluated in the context of system scale, concurrency, and deployment topology. While identifier generation is often perceived as a trivial operation, at high throughput it can directly impact latency, contention, and overall system stability.
- UUID and random-based identifiers are typically very fast because they are generated entirely in memory and do not require any shared state or external coordination. Methods such as UUID.randomUUID() and SecureRandom-based generators scale well across threads and nodes, making them suitable for stateless microservices. However, the lack of natural ordering can hurt database index locality, leading to increased index fragmentation and reduced write performance in large tables.
- Database-backed counters or auto-increment columns are simple to use but can become a performance bottleneck under high concurrency. Since each identifier requires a round-trip to the database and often a serialized increment operation, throughput is limited by database locking and transaction contention. This approach also couples identifier generation tightly to a single data source, reducing scalability and fault tolerance in distributed systems.
- Time-based and Snowflake-style identifiers offer a good balance between performance and ordering guarantees. By generating identifiers locally from system time, a machine identifier, and a sequence number, they avoid network calls; with the 12-bit sequence used above, a node can produce up to 4,096 IDs per millisecond. Their time-ordered nature improves database index locality and write efficiency. The trade-off is increased implementation complexity and the need for careful handling of clock drift, machine ID assignment, and sequence overflow.
In practice, UUIDs and secure random identifiers are sufficient for most applications, while Snowflake-style generators are preferred in high-throughput, distributed environments where ordering, scalability, and database performance are critical.
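The capacity limits of the Snowflake layout follow directly from its bit widths. The sketch below works through that arithmetic; the class and method names are hypothetical, and the 41-bit timestamp width is derived here (63 non-negative bits of a long minus 10 machine bits and 12 sequence bits) rather than declared in the generator.

```java
// Hypothetical helper for illustration; bit widths match SnowflakeIdGenerator.
final class SnowflakeCapacity {

    // Derived assumption: 63 usable bits - 10 machine bits - 12 sequence bits.
    static final int TIMESTAMP_BITS = 41;

    // Distinct sequence values available within one millisecond on one node.
    static long idsPerMillisecond(int sequenceBits) {
        return 1L << sequenceBits;
    }

    // Number of distinct machine IDs the layout can address.
    static long maxNodes(int machineIdBits) {
        return 1L << machineIdBits;
    }

    // Years until the timestamp field overflows, counted from the custom epoch.
    static double lifetimeYears(int timestampBits) {
        double totalMillis = Math.pow(2, timestampBits);
        double millisPerYear = 1000.0 * 60 * 60 * 24 * 365.25;
        return totalMillis / millisPerYear;
    }

    public static void main(String[] args) {
        System.out.println("IDs per ms per node: " + idsPerMillisecond(12));
        System.out.println("Addressable nodes:   " + maxNodes(10));
        System.out.printf("Lifetime in years:   %.1f%n", lifetimeYears(TIMESTAMP_BITS));
    }
}
```

Shifting bits between the fields trades one limit for another: a wider sequence raises per-node throughput, while a wider machine ID field allows more nodes, and either choice shortens the time before the timestamp field overflows.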
5. Conclusion
Generating valid and unique identifiers is a core architectural concern. Plain-Java random generators, especially UUIDs and secure random strings, offer a robust and easy-to-implement solution for many use cases. However, system scale, ordering requirements, and performance constraints should guide the final choice of identifier strategy.

