Creating a Java-based Remote Framebuffer Server

Virtual Network Computing (VNC) is a graphical desktop sharing system that uses the Remote Framebuffer (RFB) protocol to control another computer remotely. It transmits keyboard and mouse events from the controlling computer to the remote machine and relays screen updates back in the opposite direction over a network connection.

RFB, in essence, is a straightforward protocol designed for remote access to graphical user interfaces. Operating at the frame buffer level, it demonstrates versatility across various windowing systems and applications, including prominent ones like Microsoft Windows, Mac OS X, and the X Window System.

Building a Remote Framebuffer server-side protocol powered Swing application in Java

This article aims to guide you through implementing the RFB server-side protocol. We’ll illustrate this with a compact Java Swing application that showcases the transmission of the main window over a TCP connection to VNC viewers. The objective here is to demonstrate the fundamental features of the protocol and a potential implementation using Java.

A prerequisite for this article is a foundational understanding of the Java programming language and familiarity with basic concepts related to TCP/IP networking, the client-server model, and similar principles. Ideally, the reader would be a Java developer with some prior experience in well-known VNC implementations such as RealVNC, UltraVNC, or TightVNC.

Decoding the Remote Framebuffer Protocol Specification

The RFB protocol specification is relatively well defined. As per Wikipedia, the RFB protocol has gone through several iterations. However, our primary focus in this article will be on common messages that most VNC implementations should be able to interpret correctly, irrespective of the protocol version.

When a VNC viewer (acting as the client) establishes a TCP connection with a VNC server (running the RFB service), the initial phase involves exchanging protocol versions:

RFB Service    -----------  "RFB 003.003\n"  -------> VNC viewer
RFB Service    <----------  "RFB 003.008\n"  -------- VNC viewer

This exchange takes the form of a simple stream of bytes, which can be decoded into ASCII characters, typically resembling “RFB 003.008\n”.
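The 12-byte version message can be built and parsed with a few lines of Java. The following is a minimal sketch; the class name RfbVersion and its methods are hypothetical and are not part of the demo's source code.

```java
import java.nio.charset.StandardCharsets;
import java.util.Locale;

/** Hypothetical helper for illustration; not taken from the demo application. */
final class RfbVersion {
    final int major;
    final int minor;

    RfbVersion(int major, int minor) {
        this.major = major;
        this.minor = minor;
    }

    /** Encodes the version as the 12-byte ASCII message, e.g. "RFB 003.008\n". */
    byte[] toBytes() {
        return String.format(Locale.ROOT, "RFB %03d.%03d\n", major, minor)
                .getBytes(StandardCharsets.US_ASCII);
    }

    /** Parses a 12-byte version message and rejects anything with a different layout. */
    static RfbVersion parse(byte[] message) {
        String text = new String(message, StandardCharsets.US_ASCII);
        if (message.length != 12 || !text.startsWith("RFB ")
                || text.charAt(7) != '.' || text.charAt(11) != '\n') {
            throw new IllegalArgumentException("Not an RFB version message");
        }
        int major = Integer.parseInt(text.substring(4, 7));
        int minor = Integer.parseInt(text.substring(8, 11));
        return new RfbVersion(major, minor);
    }
}
```

Parsing the numbers, rather than only checking the "RFB" prefix, lets a server decide which handshake variant to use for the rest of the session.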

Following the version exchange, the next stage is authentication. The VNC server transmits an array of bytes to signal the types of authentication it supports. For instance:

RFB Service    -----------  0x01 0x02 -----------> VNC viewer
RFB Service    <-----------  0x02  -----------     VNC viewer

In this scenario, the VNC server presents only one possible authentication type (0x02). The leading byte, 0x01, signifies the number of available authentication types. The VNC viewer is obligated to respond with the value 0x02, as it represents the sole authentication type supported by the server in this example.
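As a rough illustration of this count-plus-list form (used by protocol versions 3.7 and later), the sketch below builds the server's offer and checks the viewer's one-byte reply. The class and method names are hypothetical; the values 1 (None) and 2 (VNC Authentication) are the standard security type numbers.

```java
import java.io.ByteArrayOutputStream;

/** Hypothetical helper for illustration; not taken from the demo application. */
final class SecurityTypes {
    static final int NONE = 1;      // no authentication
    static final int VNC_AUTH = 2;  // challenge-response password

    /** Builds the server's message: one count byte followed by one byte per type. */
    static byte[] encodeOffer(int... types) {
        if (types.length == 0 || types.length > 255) {
            throw new IllegalArgumentException("Need between 1 and 255 security types");
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(types.length);
        for (int type : types) {
            out.write(type);
        }
        return out.toByteArray();
    }

    /** Returns true if the viewer's reply names a type that was actually offered. */
    static boolean isValidChoice(byte[] offer, int choice) {
        int count = offer[0] & 0xFF;
        for (int i = 1; i <= count; i++) {
            if ((offer[i] & 0xFF) == choice) {
                return true;
            }
        }
        return false;
    }
}
```

For the exchange shown above, encodeOffer(SecurityTypes.VNC_AUTH) yields the bytes 0x01 0x02, and only a reply of 0x02 passes isValidChoice.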

Subsequently, the server issues an authentication challenge (the specifics of which vary based on the algorithm in use), and the client must respond with an appropriate challenge response message. Following this, the client awaits confirmation from the server. Once the client is successfully authenticated, it can proceed with establishing the session.

For simplicity, one can opt for no authentication whatsoever. It’s important to note that the RFB protocol, by its nature, is inherently insecure, regardless of the authentication mechanism employed. Should security be a primary concern, the recommended approach would be to tunnel RFB sessions through secure channels like VPN or SSH connections.

At this juncture, the VNC viewer sends a “shared desktop” message, indicating whether the client intends to share the session and permit other VNC viewers to connect to the same desktop. The RFB service implementation then processes this message, potentially restricting multiple VNC viewers from sharing a single screen. This message is concise, comprising only a single byte, and the valid values are either 0x00 or 0x01.

Finally, the RFB server sends a “server init” message describing the screen: its dimensions, bits per pixel, depth, a big-endian flag, a true-color flag, the maximum values for red, green, and blue, the bit positions of those colors within a pixel, and a desktop name. The first two bytes hold the screen width in pixels, followed by two bytes for the screen height.

The 16-byte pixel format comes next. A single byte gives the bits per pixel, typically 8, 16, or 32; on most modern systems with a full color range it is 32 (0x20), which tells the client that it can request full color for each pixel. The next byte is the color depth. The “big endian” byte is non-zero only if multi-byte pixels are sent in big-endian order. If the “true color” byte is non-zero, the following fields describe how to extract red, green, and blue intensities from a pixel value: six bytes hold the maximum red, green, and blue values (two bytes each), which matters in 8-bit color mode, where only a few bits are available per component, and three bytes hold the red, green, and blue shifts, which give the bit position of each color. The final three bytes of the pixel format are padding and should be ignored by the client.

After the pixel format, four bytes give the length of the desktop name, followed by the name itself as an ASCII-encoded byte array of that length.

Remote Framebuffer server-client protocol: version exchange, authentication and server init message

Once the “server init” message is sent, the RFB service should be ready to receive and decode client messages from the socket. There are six primary types of messages:

  • SetPixelFormat
  • SetEncodings
  • FramebufferUpdateRequest
  • KeyEvent
  • PointerEvent
  • ClientCutText

The protocol documentation provides precise explanations for each message type, detailing the purpose of every byte. For example, let’s consider the “server init” message:

No. of bytes    Type            Description
2               U16             framebuffer-width
2               U16             framebuffer-height
16              PIXEL_FORMAT    server-pixel-format
4               U32             name-length
name-length     U8 array        name-string

Here, PIXEL_FORMAT is defined as:

No. of bytes    Type            Description
1               U8              bits-per-pixel
1               U8              depth
1               U8              big-endian-flag
1               U8              true-colour-flag
2               U16             red-max
2               U16             green-max
2               U16             blue-max
1               U8              red-shift
1               U8              green-shift
1               U8              blue-shift
3                               padding

In this context, U16 represents an unsigned 16-bit integer (two bytes), U32 denotes an unsigned 32-bit integer, and U8 array refers to an array of bytes, and so forth.
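To make the layout concrete, here is a minimal sketch that serializes a “server init” message into a byte array. The class name and the chosen pixel format (32 bits per pixel, depth 24, little-endian, red/green/blue shifts of 16/8/0) are assumptions made for illustration; the demo's own sendServerInit() method may use different values.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper for illustration; not taken from the demo application. */
final class ServerInitMessage {

    static byte[] encode(int width, int height, String name) {
        byte[] nameBytes = name.getBytes(StandardCharsets.US_ASCII);
        // 2 + 2 bytes of dimensions, 16 bytes of PIXEL_FORMAT, 4 bytes of name-length.
        ByteBuffer buf = ByteBuffer.allocate(24 + nameBytes.length); // big-endian by default

        buf.putShort((short) width);   // framebuffer-width  (U16)
        buf.putShort((short) height);  // framebuffer-height (U16)

        // PIXEL_FORMAT (assumed values, see lead-in)
        buf.put((byte) 32);            // bits-per-pixel
        buf.put((byte) 24);            // depth
        buf.put((byte) 0);             // big-endian-flag: pixels are little-endian
        buf.put((byte) 1);             // true-colour-flag
        buf.putShort((short) 255);     // red-max
        buf.putShort((short) 255);     // green-max
        buf.putShort((short) 255);     // blue-max
        buf.put((byte) 16);            // red-shift
        buf.put((byte) 8);             // green-shift
        buf.put((byte) 0);             // blue-shift
        buf.put(new byte[3]);          // padding

        buf.putInt(nameBytes.length);  // name-length (U32)
        buf.put(nameBytes);            // name-string
        return buf.array();
    }
}
```

The resulting array can be written to the socket's output stream in one call. For an 800 x 600 window titled "Demo", the message is 28 bytes long.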

Bringing the Protocol to Life: Implementation in Java

A typical Java server application consists of a thread dedicated to listening for incoming client connections and multiple threads responsible for managing individual client connections.

/*
 * Use TCP port 5902 (display :2) as an example to listen.
 */
int port = 5902;
ServerSocket serverSocket;
serverSocket = new ServerSocket(port);

/*
 * Limit sessions to 100. This is the lazy way: if
 * somebody really opens 100 sessions, the server socket
 * will stop listening and no new VNC viewers will be
 * able to connect.
 */
while (rfbClientList.size() < 100) {
	
	/*
	 * Wait and accept new client.
	 */
	Socket client = serverSocket.accept();
	
	/*
	 * Create new object for each client.
	 */
	RFBService rfbService = new RFBService(client);
	
	/*
	 * Add it to list.
	 */
	rfbClientList.add(rfbService);
	
	/*
	 * Handle new client session in separate thread.
	 */
	(new Thread(rfbService, "RFBService" + rfbClientList.size())).start();
	
}

In this code snippet, TCP port 5902 (corresponding to display :2) is chosen. The while loop patiently waits for a client to establish a connection. The ServerSocket.accept() method operates in a blocking manner, causing the thread to halt execution until a new client connection is established. Upon a successful client connection, a new thread, RFBService, is created to handle the RFB protocol messages received from that particular client.

The RFBService class implements the Runnable interface and is equipped with methods for reading bytes from the socket. The run() method plays a crucial role, as it is executed immediately when the thread is started at the end of the loop:

@Override
public void run() {
	
	try {

		/*
		 * RFB server has to send protocol version string first
		 * and then wait for the VNC viewer to reply with its
		 * own protocol version string.
		 */
		sendProtocolVersion();
		String protocolVer = readProtocolVersion();
		if (!protocolVer.startsWith("RFB")) {
			throw new IOException();
		}

Here, the sendProtocolVersion() method transmits the RFB version string to the client (VNC viewer) and then waits to read the protocol version string sent back from the client. The client is expected to reply with a string similar to “RFB 003.008\n”. The readProtocolVersion() method, like other methods prefixed with “read,” operates in a blocking manner.

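private String readProtocolVersion() throws IOException {
	byte[] buffer = readU8Array(12);
	// The version string is plain ASCII; decode it explicitly rather
	// than relying on the platform default charset.
	return new String(buffer, java.nio.charset.StandardCharsets.US_ASCII);
}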

The readProtocolVersion() method is straightforward: it reads 12 bytes from the socket and returns the data as a string. The readU8Array(int) function is responsible for reading the specified number of bytes, which in this case is 12 bytes. If there aren’t enough bytes available to read from the socket, it patiently waits:

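private byte[] readU8Array(int len) throws IOException {
	byte[] buffer = new byte[len];
	int offset = 0;
	while (offset < len) {
		// read() blocks until at least one byte is available and
		// returns -1 once the viewer has closed the connection.
		int numOfBytesRead = in.read(buffer, offset, len - offset);
		if (numOfBytesRead == -1) {
			throw new java.io.EOFException("Connection closed after " + offset + " of " + len + " bytes");
		}
		offset += numOfBytesRead;
	}
	return buffer;
}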

Similar to readU8Array(int), we have methods like readU16int() and readU32int() that read bytes from the socket and return an integer value.
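These integer readers are not shown in the article, so the following is only a sketch of how they might look. RFB sends multi-byte integers in big-endian (network) order. The wrapper class RfbReader is hypothetical, and readU32int() returns a long here so that values above Integer.MAX_VALUE stay positive, which may differ from the demo's signature.

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

/** Hypothetical helper for illustration; not taken from the demo application. */
final class RfbReader {
    private final InputStream in;

    RfbReader(InputStream in) {
        this.in = in;
    }

    /** Reads one unsigned byte, blocking until it arrives. */
    int readU8() throws IOException {
        int b = in.read();
        if (b < 0) {
            throw new EOFException("Connection closed");
        }
        return b;
    }

    /** Reads an unsigned 16-bit integer, most significant byte first. */
    int readU16int() throws IOException {
        int high = readU8();
        int low = readU8();
        return (high << 8) | low;
    }

    /** Reads an unsigned 32-bit integer, most significant byte first. */
    long readU32int() throws IOException {
        long high = readU16int();
        long low = readU16int();
        return (high << 16) | low;
    }
}
```

The bytes 0x03 0x20 decode to 800 with readU16int(), which is how a screen width of 800 pixels appears on the wire.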

After the exchange of protocol versions, the RFB service proceeds to send the security message:

/*
 * RFB server sends security type bytes that may request
 * a user to type password.
 * In this implementation, this is set to the simplest
 * possible option: no authentication at all.
 */
sendSecurityType();

In this implementation, we’ve opted for the simplest approach: no password is required from the VNC client.

private void sendSecurityType() throws IOException {
	out.write(SECURITY_TYPE);
	out.flush();
}

Here, SECURITY_TYPE is a byte array defined as follows:

private final byte[] SECURITY_TYPE = {0x00, 0x00, 0x00, 0x01};

In RFB protocol version 3.3, the server decides the security type itself and sends it as a single unsigned 32-bit integer. The value 1 means “None,” so this byte sequence tells the VNC viewer that it doesn’t need to provide a password.

The next piece of information the RFB service expects from the client is the “shared desktop” flag, which is a single byte transmitted over the socket.

/*
 * RFB server reads shared desktop flag. It's a single
 * byte that tells the RFB server whether it should allow
 * multiple VNC viewers to be connected at the same time.
 */
byte sharedDesktop = readSharedDesktop();

While we read the “shared desktop” flag from the socket, our current implementation chooses to ignore it.

Next, the RFB service is responsible for sending the “server init” message:

/*
 * RFB server sends ServerInit message that includes 
 * screen resolution,
 * number of colors, depth, screen title, etc.
 */
screenWidth = JFrameMainWindow.jFrameMainWindow.getWidth();
screenHeight = JFrameMainWindow.jFrameMainWindow.getHeight();
String windowTitle = JFrameMainWindow.jFrameMainWindow.getTitle();
sendServerInit(screenWidth, screenHeight, windowTitle);			

In our demo, JFrameMainWindow is a JFrame serving as the source of graphics. The “server init” message mandates the inclusion of the screen width and height in pixels, along with the desktop title. In this example, we’re using the JFrame’s title, obtained using the getTitle() method.

Following the “server init” message, the RFB service thread enters a loop where it continuously reads six types of messages from the socket:

/*
 * Main loop where clients messages are read from socket.
 */
while (true) {

	/*
	 * Mark first byte and read it.
	 */
	in.mark(1);
	int messageType = in.read();
	if (messageType == -1) {
		break;
	}
	/*
	 * Go one byte back.
	 */
	in.reset();
	
	/*
	 * Depending on message type, read complete message on socket.
	 */
	if (messageType == 0) {
		/*
		 * Set Pixel Format
		 */
		readSetPixelFormat();
	}
	else if (messageType == 2) {
		/*
		 * Set Encodings
		 */
		readSetEncoding();
	}
	else if (messageType == 3) {
		/*
		 * Frame Buffer Update Request
		 */
		readFrameBufferUpdateRequest();
	}
	else if (messageType == 4) {
		/*
		 * Key Event
		 */
		readKeyEvent();
	}
	else if (messageType == 5) {
		/*
		 * Pointer Event
		 */
		readPointerEvent();
	}
	else if (messageType == 6) {
		/*
		 * Client Cut Text
		 */
		readClientCutText();
	}
	else {
		err("Unknown message type. Received message type = " + messageType);
	}
}

Each of these methods, from readSetPixelFormat() and readSetEncoding() through readClientCutText(), blocks until its whole message has been read and then triggers the action that corresponds to that message.

For instance, the readClientCutText() method is responsible for reading the text sent by the VNC viewer when a user performs a “cut” operation on the client side. The text is encoded within the message and transmitted to the server via the RFB protocol. Upon receipt, the server places this text into the system clipboard.
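The article does not list readClientCutText(), so here is a minimal sketch of the parsing step under stated assumptions. A ClientCutText message consists of the type byte (6), three padding bytes, a 32-bit length, and the text encoded as ISO 8859-1. The class name and the 1 MiB size cap are hypothetical choices, not taken from the demo.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper for illustration; not taken from the demo application. */
final class ClientCutTextReader {
    // Assumed upper bound so a bogus length cannot trigger a huge allocation.
    private static final int MAX_TEXT_LENGTH = 1 << 20;

    static String read(InputStream socketIn) throws IOException {
        DataInputStream in = new DataInputStream(socketIn);

        int messageType = in.readUnsignedByte();
        if (messageType != 6) {
            throw new IOException("Expected ClientCutText (6), got " + messageType);
        }
        in.readFully(new byte[3]);      // three padding bytes
        int length = in.readInt();      // U32 length, big-endian
        if (length < 0 || length > MAX_TEXT_LENGTH) {
            throw new IOException("Unreasonable cut text length: " + length);
        }
        byte[] text = new byte[length];
        in.readFully(text);             // blocks until the whole text has arrived
        return new String(text, StandardCharsets.ISO_8859_1);
    }
}
```

On a machine with a display, the returned string could then be placed on the system clipboard with Toolkit.getDefaultToolkit().getSystemClipboard().setContents(new StringSelection(text), null).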

Deconstructing Client Messages

The RFB service must support all six types of client messages, at least at the byte level. This means that whenever the client transmits a message, the server is obligated to read the entire message, byte by byte. This is because the RFB protocol is byte-oriented, and there are no delimiters between consecutive messages.
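Because nothing separates one message from the next, the server has to know how many bytes each message occupies. The sketch below summarizes the sizes of the six client messages; the class and method names are hypothetical, while the sizes follow from the field layouts in the protocol specification.

```java
/** Hypothetical helper for illustration; not taken from the demo application. */
final class ClientMessageSizes {

    /**
     * Returns the total size in bytes of a fixed-length client message,
     * or -1 if the length depends on a count field inside the message.
     */
    static int fixedLength(int messageType) {
        switch (messageType) {
            case 0: return 20; // SetPixelFormat: type + 3 padding + 16-byte PIXEL_FORMAT
            case 3: return 10; // FramebufferUpdateRequest: type + incremental + x, y, width, height
            case 4: return 8;  // KeyEvent: type + down-flag + 2 padding + U32 key
            case 5: return 6;  // PointerEvent: type + button-mask + U16 x + U16 y
            case 2:            // SetEncodings: length depends on number-of-encodings
            case 6:            // ClientCutText: length depends on the text length
                return -1;
            default:
                throw new IllegalArgumentException("Unknown message type " + messageType);
        }
    }

    /** SetEncodings: type + padding + U16 count, then one S32 per encoding. */
    static int setEncodingsLength(int numberOfEncodings) {
        return 4 + 4 * numberOfEncodings;
    }

    /** ClientCutText: type + 3 padding + U32 length, then the text bytes. */
    static long clientCutTextLength(long textLength) {
        return 8 + textLength;
    }
}
```

A server that does not care about a particular message can still use these sizes to skip it and stay in sync with the byte stream.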

The most crucial message type is the “frame buffer update request,” where the client can request either a full or incremental update of the screen content.

private void readFrameBufferUpdateRequest() throws IOException {
	
	int messageType = in.read();
	int incremental = in.read();
	
	if (messageType == 0x03) {
		
		int x_pos = readU16int(); 
		int y_pos = readU16int();
		int width = readU16int();
		int height = readU16int();

		screenWidth  = width;
		screenHeight = height;
		
		if (incremental == 0x00) {
							
			incrementalFrameBufferUpdate = false;				
			
			int x = JFrameMainWindow.jFrameMainWindow.getX();
			int y = JFrameMainWindow.jFrameMainWindow.getY();

			RobotScreen.robo.getScreenshot(x, y, width, height); 
			
			sendFrameBufferUpdate(x_pos, y_pos, width, height, 0, RobotScreen.robo.getColorImageBuffer());					
			
			
		}
		else if (incremental == 0x01) {
			
			incrementalFrameBufferUpdate = true;
			
		}
		else {
			throw new IOException();
		}
	}
	else {
		throw new IOException();
	}

}

The first byte of the “frame buffer update request” message indicates the message type, which is always 0x03. The next byte is the “incremental” flag, signaling to the server whether to send the entire frame or just the differences since the last update. In the case of a full update request, the RFB service will capture a screenshot of the main window using the RobotScreen class and transmit it to the client.

Conversely, if the request is for an incremental update, a flag named incrementalFrameBufferUpdate is set to true. Swing components will then consult this flag to determine if they need to send only the portions of the screen that have changed. Typically, components like JMenu, JMenuItem, JTextArea, and others might need to perform incremental screen updates when the user moves the mouse pointer, clicks, types, or interacts in similar ways.

The sendFrameBufferUpdate(int, int, int, int, int, int[]) method is responsible for sending the image buffer data to the socket.

public void sendFrameBufferUpdate(int x, int y, int width, int height, int encodingType, int[] screen) throws IOException {
	
	if (x + width > screenWidth || y + height > screenHeight) {
		err ("Invalid frame update size:"); 
		err (" x = " + x + ", y = " + y);
		err (" width = " + width + ", height = " + height);
		return;
	}
	
	byte messageType = 0x00;
	byte padding     = 0x00;
	
	out.write(messageType);
	out.write(padding);
	
	int numberOfRectangles = 1;
	
	writeU16int(numberOfRectangles);	
	
	writeU16int(x);
	writeU16int(y);
	writeU16int(width);
	writeU16int(height);
	writeS32int(encodingType);

	for (int rgbValue : screen) {

		int red   = (rgbValue & 0x000000FF);
		int green = (rgbValue & 0x0000FF00) >> 8;
		int blue  = (rgbValue & 0x00FF0000) >> 16;

		if (bits_per_pixel == 8) {
			out.write((byte) colorMap.get8bitPixelValue(red, green, blue));
		}
		else {
			out.write(red);
			out.write(green);
			out.write(blue);
			out.write(0);
		}
	}
	out.flush();
}

This method first ensures that the (x, y) coordinates, along with the width and height of the image buffer, don’t exceed the screen boundaries. The message type value for a “frame buffer update” is 0x00. The “padding” value is typically set to 0x00 and should be ignored by the VNC viewer. The “number of rectangles” is a two-byte value indicating the number of rectangles that follow within the message.

Each rectangle is defined by its upper-left coordinate, width, height, encoding type, and pixel data. The RFB protocol supports several efficient encoding formats, such as ZRLE, Hextile, and Tight. However, for the sake of simplicity and clarity, our implementation utilizes the “raw” encoding format.

Raw encoding transmits each pixel’s color as RGB components. If the client has set the pixel format to 32-bit, 4 bytes are transmitted for each pixel; in 8-bit color mode, each pixel is transmitted as a single byte. The for loop in the code above shows both cases. In 8-bit mode, a color map is used to find the best match for each pixel from the screenshot or image buffer. In 32-bit mode, the image buffer holds an array of integers, each packing the RGB components of one pixel.
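The byte layout of a raw pixel depends on the negotiated pixel format, so the sketch below fixes two formats as assumptions: a 32-bit little-endian format with red, green, and blue shifts of 16, 8, and 0, and an 8-bit format with 3 bits of red, 3 of green, and 2 of blue (shifts 0, 3, and 6). The 8-bit case uses simple scaling instead of the demo's color map lookup, and the class name is hypothetical. Input pixels are Java-style 0x00RRGGBB integers.

```java
/** Hypothetical helper for illustration; not taken from the demo application. */
final class RawPixelEncoder {

    /** Packs 0x00RRGGBB pixels into 4 bytes each: blue, green, red, unused. */
    static byte[] encode32(int[] rgbPixels) {
        byte[] out = new byte[rgbPixels.length * 4];
        int i = 0;
        for (int rgb : rgbPixels) {
            out[i++] = (byte) (rgb & 0xFF);          // blue  (shift 0, sent first in little-endian)
            out[i++] = (byte) ((rgb >> 8) & 0xFF);   // green (shift 8)
            out[i++] = (byte) ((rgb >> 16) & 0xFF);  // red   (shift 16)
            out[i++] = 0;                            // unused top byte
        }
        return out;
    }

    /** Reduces one 0x00RRGGBB pixel to a single byte: red-max 7, green-max 7, blue-max 3. */
    static byte encode8(int rgb) {
        int red = ((rgb >> 16) & 0xFF) * 7 / 255;
        int green = ((rgb >> 8) & 0xFF) * 7 / 255;
        int blue = (rgb & 0xFF) * 3 / 255;
        return (byte) (red | (green << 3) | (blue << 6));
    }
}
```

Whatever format is used, the bytes written for each pixel must agree with the shifts and maximum values announced in the “server init” message or later requested by the client.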

A Swinging Demo: Building the Application

Our Swing demo application includes an action listener responsible for triggering the sendFrameBufferUpdate(int, int, int, int, int, int[]) method. In a typical scenario, application elements like Swing components would have listeners that send screen change notifications to the client. For instance, if a user enters text into a JTextArea, this change should be reflected on the VNC viewer.

public void actionPerformed(ActionEvent arg0) {

	/*
	 * Get dimensions and location of main JFrame window.
	 */
	int offsetX = JFrameMainWindow.jFrameMainWindow.getX();
	int offsetY = JFrameMainWindow.jFrameMainWindow.getY();

	int width  = JFrameMainWindow.jFrameMainWindow.getWidth();
	int height = JFrameMainWindow.jFrameMainWindow.getHeight();

	/*
	 * Do not update screen if main window dimension has changed.
	 * Upon main window resize, another action listener will
	 * take action.
	 */
	int screenWidth = RFBDemo.rfbClientList.get(0).screenWidth;
	int screenHeight = RFBDemo.rfbClientList.get(0).screenHeight;
	if (width != screenWidth || height != screenHeight) {
			return;
	}
			
	/*
	 * Capture new screenshot into image buffer.
	 */
	RobotScreen.robo.getScreenshot(offsetX, offsetY, width, height);
	
	int[] delta = RobotScreen.robo.getDeltaImageBuffer();         	                	

	if (delta == null) {

			offsetX = 0;
			offsetY = 0;
			
			Iterator<RFBService> it = RFBDemo.rfbClientList.iterator();
			while (it.hasNext()) {

					RFBService rfbClient = it.next();

					if (rfbClient.incrementalFrameBufferUpdate) {

						try {

							/*
							 * Send complete window.
							 */
							rfbClient.sendFrameBufferUpdate(
											offsetX, offsetY,
											width, height,
											0,
										RobotScreen.robo.getColorImageBuffer());
						}
						catch (SocketException ex) {
							it.remove();
						}
						catch (IOException ex) {
							ex.printStackTrace();

							it.remove();
						}

						rfbClient.incrementalFrameBufferUpdate = false;

					}
			}
	}
	else {

			offsetX = RobotScreen.robo.getDeltaX();
			offsetY = RobotScreen.robo.getDeltaY();

			width =  RobotScreen.robo.getDeltaWidth();
			height =  RobotScreen.robo.getDeltaHeight();

			Iterator<RFBService> it = RFBDemo.rfbClientList.iterator();
			while (it.hasNext()) {

					RFBService rfbClient = it.next();

					if (rfbClient.incrementalFrameBufferUpdate) {

						try {
							
							/*
							 * Send only delta rectangle.
							 */
							rfbClient.sendFrameBufferUpdate(
											offsetX, offsetY,
											width, height,
											0,
											delta);

						}
						catch (SocketException ex) {
							it.remove();
						}
						catch (IOException ex) {
							ex.printStackTrace();

							it.remove();
						}

						rfbClient.incrementalFrameBufferUpdate = false;

					}
			}
	}
}

The action listener is straightforward. It captures a screenshot of the main window (JFrameMainWindow) using the RobotScreen class and then asks getDeltaImageBuffer() for the changed region. If that call returns null, the complete image buffer is sent; otherwise only the changed rectangle, described by getDeltaX(), getDeltaY(), getDeltaWidth(), and getDeltaHeight(), is transmitted. In both cases an update goes only to clients whose incrementalFrameBufferUpdate flag is set. The code also handles multiple connected clients, which is why it iterates over the list kept in RFBDemo.rfbClientList and removes any client whose socket fails.

The “frame buffer update” action listener can be used in conjunction with a Timer, which can be started whenever a JComponent undergoes a change:

/*
 * Define timer for frame buffer update with 400 ms delay and
 * no repeat.
 */
timerUpdateFrameBuffer = new Timer(400, new ActionListenerFrameBufferUpdate());
timerUpdateFrameBuffer.setRepeats(false);

This particular code snippet resides within the constructor of the JFrameMainWindow class. The timer is started by the doIncrementalFrameBufferUpdate() method:

public void doIncrementalFrameBufferUpdate() {

	if (RFBDemo.rfbClientList.size() == 0) {
		return;
	}

	if (!timerUpdateFrameBuffer.isRunning()) {		
		timerUpdateFrameBuffer.start();
	} 

}

Other action listeners typically invoke the doIncrementalFrameBufferUpdate() method:

public class DocumentListenerChange implements DocumentListener {

	@Override
	public void changedUpdate(DocumentEvent e) {
		JFrameMainWindow jFrameMainWindow = JFrameMainWindow.jFrameMainWindow;
		jFrameMainWindow.doIncrementalFrameBufferUpdate();		
	}

	// ...

}

This approach is designed for simplicity and ease of understanding. It only requires a reference to the JFrameMainWindow instance and a single call to the doIncrementalFrameBufferUpdate() method. This method checks for active client connections and, if any exist, starts the timerUpdateFrameBuffer timer. Once the timer starts ticking, the action listener will capture a screenshot and execute the sendFrameBufferUpdate() method.

The figure above illustrates the relationship between the various listeners and the frame buffer update process. Most listeners are triggered when the user interacts with the application, for example by clicking, selecting text, or typing in a text area. These listeners call the doIncrementalFrameBufferUpdate() method, which starts timerUpdateFrameBuffer. When the timer fires, its action listener calls the sendFrameBufferUpdate() method of the RFBService class, which leads to a screen update on the client side (VNC viewer).

Capturing the Screen, Simulating Keystrokes, and Controlling the Mouse Pointer

Java provides a built-in Robot class that empowers developers to create applications capable of capturing screenshots, sending keystrokes, manipulating the mouse pointer, simulating clicks, and more.

To capture the specific area of the screen where the JFrame window is displayed, we utilize the RobotScreen class. The core method here is getScreenshot(int, int, int, int), which captures a rectangular region of the screen. RGB values for each pixel within the captured region are stored in an int[] array:

public void getScreenshot(int x, int y, int width, int height) {

	Rectangle screenRect = new Rectangle(x, y, width, height);
	BufferedImage colorImage = robot.createScreenCapture(screenRect);

	previousImageBuffer = colorImageBuffer;

	colorImageBuffer = ((DataBufferInt) colorImage.getRaster().getDataBuffer()).getData();

	if (previousImageBuffer == null ||
			previousImageBuffer.length != colorImageBuffer.length) {
		previousImageBuffer = colorImageBuffer;
	}

	this.width = width;
	this.height = height;

}

This method stores pixel data in the colorImageBuffer array. To retrieve this pixel data, we can use the getColorImageBuffer() method.

Furthermore, the method retains a copy of the previous image buffer, enabling us to extract only the pixels that have changed between captures. To obtain just the differences within the captured image area, we can utilize the getDeltaImageBuffer() method.
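The article does not show how the difference is computed. One possible approach, sketched below, is to find the smallest rectangle that covers every changed pixel and then copy that rectangle out of the new frame. The class FrameDelta and its methods are hypothetical, and unlike the demo's getDeltaImageBuffer(), this sketch returns null when the two frames are identical.

```java
/** Hypothetical helper for illustration; not taken from the demo application. */
final class FrameDelta {

    /**
     * Returns {x, y, width, height} of the smallest rectangle containing all
     * pixels that differ between the two frames, or null if nothing changed.
     */
    static int[] changedBounds(int[] previous, int[] current, int width, int height) {
        if (previous.length != current.length || current.length != width * height) {
            throw new IllegalArgumentException("Frame sizes do not match");
        }
        int minX = width, minY = height, maxX = -1, maxY = -1;
        for (int y = 0; y < height; y++) {
            for (int x = 0; x < width; x++) {
                int i = y * width + x;
                if (previous[i] != current[i]) {
                    minX = Math.min(minX, x);
                    maxX = Math.max(maxX, x);
                    minY = Math.min(minY, y);
                    maxY = Math.max(maxY, y);
                }
            }
        }
        if (maxX < 0) {
            return null; // frames are identical
        }
        return new int[] {minX, minY, maxX - minX + 1, maxY - minY + 1};
    }

    /** Copies the given rectangle out of a full frame, row by row. */
    static int[] crop(int[] frame, int frameWidth, int x, int y, int w, int h) {
        int[] out = new int[w * h];
        for (int row = 0; row < h; row++) {
            System.arraycopy(frame, (y + row) * frameWidth + x, out, row * w, w);
        }
        return out;
    }
}
```

The bounds map directly onto the x, y, width, and height fields of a frame buffer update rectangle, and the cropped array supplies its pixel data.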

The Robot class simplifies the process of sending simulated keystrokes to the system. However, it’s important to handle special key codes received from VNC viewers and translate them correctly. The RobotKeyboard class offers the sendKey(int, int) method, which adeptly handles both special keys and alphanumeric keys:

public void sendKey(int keyCode, int state) {
	switch (keyCode) {
	case 0xff08:
		doType(VK_BACK_SPACE, state);
		break;
	case 0xff09:
		doType(VK_TAB, state);
		break;
	case 0xff0d: case 0xff8d:
		doType(VK_ENTER, state);
		break;
	case 0xff1b:
		doType(VK_ESCAPE, state);
		break;

	case 0xffe1: case 0xffe2:
		doType(VK_SHIFT, state);           	
		break;                	
	case 0xffe3: case 0xffe4:
		doType(VK_CONTROL, state);         	
		break;          	
	case 0xffe9: case 0xffea:
		doType(VK_ALT, state);             	
		break;          	
	default:
		
		/*
		 * Translation of a..z keys.
		 */
		if (keyCode >= 97 && keyCode <= 122) {
			/*
			 * Turn lower-case a..z key codes into upper-case A..Z key codes.
			 */
			keyCode = keyCode - 32;
		}
		
		doType(keyCode, state);

	}
}

The “state” argument determines whether the key is being pressed or released. After translating the key code into a VK constant (the VK_* key codes defined in java.awt.event.KeyEvent), the doType(int, int) method passes the key value to the Robot instance, which replicates the action of a local user pressing or releasing the corresponding key on the keyboard:

private void doType(int keyCode, int state) {
	if (state == 0) {
		robot.keyRelease(keyCode);
	}
	else {
		robot.keyPress(keyCode);
	}
}

Similar to RobotKeyboard, we have the RobotMouse class, which handles pointer events and can move the mouse pointer and simulate clicks.

public void mouseMove(int x, int y) {
	robot.mouseMove(x, y);
}
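Click handling is not shown in the article. A PointerEvent carries a button mask in which bits 0, 1, and 2 stand for the left, middle, and right buttons, so one way to simulate clicks is to compare the new mask with the previous one and translate the changed bits into the masks that Robot.mousePress(int) and Robot.mouseRelease(int) expect. The sketch below assumes exactly that; the class and method names are hypothetical.

```java
import java.awt.event.InputEvent;

/** Hypothetical helper for illustration; not taken from the demo application. */
final class PointerButtons {
    // Index = RFB button-mask bit: 0 = left, 1 = middle, 2 = right.
    private static final int[] AWT_MASKS = {
        InputEvent.BUTTON1_DOWN_MASK,
        InputEvent.BUTTON2_DOWN_MASK,
        InputEvent.BUTTON3_DOWN_MASK
    };

    /** Converts the low three bits of an RFB button mask into an AWT button mask. */
    static int toAwtMask(int rfbBits) {
        int awt = 0;
        for (int bit = 0; bit < AWT_MASKS.length; bit++) {
            if ((rfbBits & (1 << bit)) != 0) {
                awt |= AWT_MASKS[bit];
            }
        }
        return awt;
    }

    /** Buttons that are down now but were up in the previous event. */
    static int pressed(int previousMask, int currentMask) {
        return toAwtMask(currentMask & ~previousMask);
    }

    /** Buttons that were down in the previous event but are up now. */
    static int released(int previousMask, int currentMask) {
        return toAwtMask(previousMask & ~currentMask);
    }
}
```

A caller would pass a non-zero pressed(...) result to robot.mousePress(...) and a non-zero released(...) result to robot.mouseRelease(...), remembering the latest mask for the next event.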

All three classes, RobotScreen, RobotMouse, and RobotKeyboard, create a new Robot instance within their constructors:

this.robot = new Robot();

In our application, we only need a single instance of each class, as there’s no need for multiple instances of RobotScreen, RobotMouse, or RobotKeyboard.

public static void main(String[] args) {
	...
	/*
	* Initialize static Robot objects for screen, keyboard and mouse.
	*/
	 RobotScreen.robo = new RobotScreen();
	 RobotKeyboard.robo = new RobotKeyboard();
	 RobotMouse.robo = new RobotMouse();
	 ...
}	

In this particular demo application, these instances are created within the main() function.

The culmination of our efforts is a Swing-based Java application functioning as an RFB service provider, allowing standard VNC viewers to connect to it.

In Conclusion: The Power and Potential of RFB

The RFB protocol enjoys widespread use and acceptance, with client implementations in the form of VNC viewers readily available for virtually every platform and device. While its primary purpose is to remotely display desktops, its applications extend far beyond this. Developers can leverage RFB to create innovative graphical tools and access them remotely, enhancing existing remote workflows.

This article has provided a comprehensive overview of the RFB protocol, covering its message formats, screen transmission techniques, and methods for handling keyboard and mouse interactions. For those eager to delve deeper, the full source code of the Swing demo application is available on GitHub.

Licensed under CC BY-NC-SA 4.0