Terracotta Administrator Console Guide
Introduction
The Terracotta Administrator Console is a comprehensive set of tools for monitoring, debugging, and controlling various aspects of your Terracotta cluster. The console functions as a JMX client with a graphical user interface.
Some features are found only in an enterprise version of the Terracotta Administrator Console. To learn more about the benefits of an enterprise version of Terracotta, see Enterprise Products.
Using the console, you can:
These and other console features are described below.

Launching the Terracotta Administrator Console
You can launch the Terracotta Administrator Console from a command line.
Microsoft Windows
[PROMPT] %TERRACOTTA_HOME%\bin\admin.bat
UNIX/Linux
[PROMPT] ${TERRACOTTA_HOME}/bin/admin.sh &

What You Will See
The console contains a connect/disconnect panel, a log window, a status line, and an expandable list of Terracotta clusters.
The cluster list may already be populated because of pre-existing references to previously defined Terracotta clusters. These references are maintained as Java properties and persist across sessions and product upgrades. If no clusters have been defined, a default cluster (host=localhost, jmx-port=9520) is created.

Context-Sensitive Help
Context-sensitive help is available wherever the help button appears in the Terracotta Administrator Console.

Working with Clusters
Clusters are the highest-level nodes in the expandable cluster list displayed by the Terracotta Administrator Console. A single Terracotta cluster defines a domain of Terracotta servers and clients being clustered by Terracotta. A single Terracotta cluster can have one or more servers and one or more clients. For example, the Terracotta servers in a setup featuring server redundancy, along with their clients, appear under the same cluster.

Adding and Removing Clusters
To add a new cluster reference, choose File|New cluster. Alternatively, right-click in the cluster list area to open a context menu, then choose New cluster.
The cluster topology is determined from the server specified in the connection panel's Server Host and JMX Port fields. These fields are editable when the console is not connected to the cluster.
To remove an existing cluster reference, right-click the cluster in the cluster list to open the context menu and choose Delete.

Connecting to a Cluster
To connect to an existing cluster, select the cluster node in the cluster list, then click the Connect button in the connect/disconnect panel. You can also connect to a specific cluster by choosing Connect from its context menu. After a successful connection, the cluster node becomes expandable and server connection information appears in the connect/disconnect panel.
To automatically connect to a cluster whenever the Terracotta Administrator Console starts or when at least one of the cluster's servers is running, enable Auto-connect in the cluster context menu. Automatic connections are attempted in the background and do not interfere with normal console operation.

Connecting to a Secured Cluster
A Terracotta cluster can be secured for JMX access, requiring authentication before access is granted. Connecting to a secured cluster prompts users to enter a username and password. For instructions on how to secure your Terracotta cluster for JMX, see the Configuration Guide and Reference.

Disconnecting from a Cluster
To disconnect from a cluster, click Disconnect on the cluster panel or select Disconnect from the cluster context menu.

Working with Servers
The Servers node appears under a connected Terracotta cluster in the cluster list. The Servers panel displays a table of servers in the cluster, including each server's name, hostname or IP address, and JMX port.
To view the servers in a cluster, expand the Terracotta cluster, then expand the Servers node.

(Enterprise feature)
Pending Client Transactions
The Servers panel in the enterprise version of the Terracotta Administrator Console displays a bar graph showing the number of unacknowledged Terracotta transaction broadcasts for each client in the cluster. You can open a client's panel by double-clicking its bar graph in the Servers panel. This is useful for quickly locating and investigating clients that maintain a large or growing number of unacknowledged Terracotta transaction broadcasts. Such clients could slow down an entire cluster and may need to be disconnected.

Server Panel
Selecting a specific server's node displays that server's panel, with the Main, Environment, and Config tabs.
The Main tab displays the server status and a list of properties, including the server's IP address, version, license, and persistence and failover modes.
The Environment tab displays the server's JVM system properties and provides a case-sensitive Find tool.
The Config tab displays the Terracotta configuration the server is using and provides a case-sensitive Find tool.
Each server node contains two child nodes: Runtime statistics and Thread dumps.

Connecting and Disconnecting from a Server
The Terracotta Administrator Console is connected to all of a cluster's servers when it is connected to the cluster. Being connected to a server means that the console is listening for JMX events coming from that server.
The console is disconnected from a cluster's servers when it is disconnected from the cluster. The console is also disconnected from a server when that server is shut down; the server still appears as part of the cluster, but its connection status changes. Servers can be shut down using the stop-tc-server script or the server shutdown button (see below).

(Enterprise feature)
The Server Shutdown Button

Server Connection Status
A Terracotta server's connection status is indicated by a status light next to the server's name. The light's color indicates the server's current connection status. A cluster can have one server, or be configured with multiple servers that communicate state over the network or use a shared file system.
The following figure shows a two-member cluster with an active server named primary and a passive server named secondary in standby mode.
The following table summarizes the connection status lights.

Working with Clients
The Clients node appears under a connected Terracotta cluster in the cluster list. The Clients panel displays a table of connected clients. The table has the following columns:
To view the list of clients in a cluster, expand the Terracotta cluster, then expand the Clients node.

Client Panel
When a Terracotta client connects to the cluster being monitored by the Terracotta Administrator Console, a client node is created under the Clients node. Selecting a specific client's node displays that client's panel, with the Main, Environment, Config, and Logging tabs.
The Main tab displays a list of client properties such as hostname and DSO port.
The Environment tab displays the client's JVM system properties and provides a case-sensitive Find tool.
The Config tab displays the Terracotta configuration the client is using and provides a case-sensitive Find tool.
The Logging tab displays options to add logging items corresponding to DSO client debugging. See the Configuration Guide and Reference for details on the various debug logging options.

Connecting and Disconnecting Clients
When started up properly, a Terracotta client is automatically added to the appropriate cluster. When a Terracotta client is shut down or disconnects from a server, that client is automatically removed from the cluster and no longer appears in the Terracotta Administrator Console cluster list.

(Enterprise feature)
The Client Disconnect Button

Monitoring Clusters, Servers, and Clients
The Terracotta Administrator Console provides visual monitoring functions using icons, graphs, statistics, counters, and both simple and nested lists. You can use these features to monitor the overall health of your cluster as well as the health of individual cluster components.

Shared Objects
Applications clustered with Terracotta use shared objects to keep data coherent. Monitoring shared objects serves as an important early-warning and troubleshooting method that allows you to:
The Terracotta Administrator Console provides the following tools for monitoring shared objects:
These tools are discussed in the following sections.

Cluster Object Browsers
The cluster object browser is a panel displaying the shared object graphs for an entire cluster. All of the shared objects in the cluster-wide heap are graphed, but the browser doesn't indicate which clients are sharing them. To see object graphs specific to a client, see the client object browser.
To open the cluster object browser, click Cluster object browser under a cluster node.
The browser panel displays a running total of the live objects in the cluster. This is the number of objects currently found in the cluster-wide heap; however, this total does not correspond to the number of objects you see in the object graph because certain objects, including literals such as strings, are not counted. These uncounted objects appear in the object graph without an object ID.
Each element in an object graph has the following format:
fully.qualified.name.of.object (fully.qualified.name.of.object.type) [<number of subelements shown>/<total number of subelements>] [@<object ID>]
The following are important aspects of the object graph display:
An object browser does not refresh automatically. You can refresh it manually in any of the following ways:

Client Object Browsers
The client object browser is a panel displaying the shared object graphs for a single client. All of the shared objects known to the client are graphed, but the ones not being shared by the client are grayed out.
To open the client object browser, click Client object browser under a client node.
The browser panel displays a running total of the live objects in the client. This is the number of objects currently found in the client heap; however, this total does not correspond to the number of objects you see in the object graph because the following types of objects are not counted:
Each element in an object graph has the following format:
fully.qualified.name.of.object (fully.qualified.name.of.object.type) [<number of subelements shown>/<total number of subelements>] [@<object ID>]
The following are important aspects of the object graph display:
An object browser does not refresh automatically. You can refresh it manually in any of the following ways:

Classes Browser
DSO allows for transparent, clustered object state synchronization. To accomplish this, some of your application classes are adapted into new classes that are cluster-aware. The set of all such adapted classes known to the server is displayed in the Classes panel.
The Tabular tab shows all the adapted classes in a spreadsheet view, including the class name and a count of the number of instances of the class that have been created since the server started.
The Tree tab shows a hierarchical, or Java package, view of the adapted classes.
The TreeMap tab shows a presentation that makes it easy to quickly distinguish the most (and least) heavily used adapted classes.
These views are snapshots of the adapted classes known to the server. You can refresh these values by selecting Refresh from the context menu on the Classes node.

Object Flush and Fault Rate Graphs
Object flush and fault rates are a measure of shared data flow between Terracotta servers and clients. These graphs can reflect trends in the flow of shared objects in a Terracotta cluster. Upward trends in flow can indicate insufficient heap memory, poor locality of reference, or newly changed environmental conditions. For more information, see Object Flush Rate and Object Fault Rate.

Cache Miss Rate Graph
The Cache Miss Rate measures the number of client requests for an object that cannot be met by a server's cache and must be faulted in from disk. An upward trend in this graph can expose a bottleneck in your cluster. For more information, see Cache Miss Rate.

Runtime Logging of New Shared Objects
You can log the creation of all new shared objects by following these steps:
During development or debugging operations, logging new objects may reveal patterns that introduce inefficiencies or errors into your clustered application. However, during production it is recommended that this type of intensive logging be disabled. See the Configuration Guide and Reference for details on the various debug logging options.

Runtime Statistics
Runtime statistics provide a continuous feed of sampled real-time data on a number of server and client metrics. The data is plotted on a graph with configurable polling and historical periods. Sampling begins automatically when a runtime statistics panel is first viewed, but historical data is not saved. To record and save historical data, see the Cluster Statistics Recorder.
Each runtime statistics panel has the following controls:
To view runtime statistics for an individual server or client, click Runtime statistics under that server or client in the cluster list.

(Enterprise feature)
Cluster Runtime Stats
To view runtime statistics for every server and client in a cluster in one panel, click Cluster runtime stats under that cluster in the cluster list.
Specific runtime statistics are defined in the following sections. The cluster components for which the statistic is available are indicated in parentheses.

Heap Usage (Cluster, Server, Client)
Shows the amount, in megabytes, of maximum available heap and heap being used.

CPU Usage (Cluster, Server, Client)
Shows the CPU load as a percentage. If more than one CPU is being used, each CPU's load is shown in a separate graph line.

Transaction Rate (Cluster, Server, Client)
Shows the number of completed Terracotta transactions. Terracotta transactions are sets of one or more clustered object changes, or writes, that must be applied atomically.

Cache Miss Rate (Server)
Disk-to-server faults occur when an object is not available in a server's in-memory cache. The Cache Miss Rate statistic is a measure of how many objects per second are being faulted from disk in response to client requests. Objects being requested for the first time, or objects that have been flushed from the server heap before a request arrives, must be faulted in from disk. A high Cache Miss Rate may indicate inadequate memory allocation at the server.

Unacknowledged Transaction Broadcasts (Client)
Every Terracotta transaction in a Terracotta cluster must be acknowledged by the Terracotta clients whose in-memory shared objects are affected by that transaction. For each client, Terracotta server instances keep a count of transactions that have not been acknowledged by that client. The Unacknowledged Transaction Broadcasts statistic is a count of how many transactions the client has yet to acknowledge. An upward trend in this statistic indicates that a client is not keeping up with transaction acknowledgments, which can slow the entire cluster. Such a client may need to be disconnected.

Object Flush Rate (Server, Client)
The Object Flush Rate statistic is a measure of how many objects are being flushed out of client memory to the Terracotta server. These objects remain available in the Terracotta server if they are needed later. A high flush rate may indicate inadequate memory allocation at the client.
On a server, the Object Flush Rate is a total including all clients. On a client, the Object Flush Rate indicates only the objects that client is flushing.

Object Fault Rate (Server, Client)
The Object Fault Rate statistic is a measure of how many objects are being faulted into client memory from the server. A high fault rate may indicate poor locality of reference or inadequate memory allocation at the client.
On a server, the Object Fault Rate is a total including all clients. On a client, the Object Fault Rate indicates only the objects that have been faulted to it.
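As a rough illustration of what an "upward trend" in Unacknowledged Transaction Broadcasts might look like numerically, the following sketch fits a least-squares slope to a series of sampled counts. It is not Terracotta code and does not reflect how the console evaluates trends. The class name, method names, threshold, and sample values are all hypothetical, and the samples are assumed to have been taken at equal intervals.

```java
// Illustrative only: not part of Terracotta. Estimates whether a series of
// equally spaced samples (for example, unacknowledged broadcast counts read
// off a graph) is trending upward.
final class UnackedTrendSketch {

    // Least-squares slope of the samples, in counts per sample interval.
    static double slope(long[] samples) {
        int n = samples.length;
        if (n < 2) {
            return 0.0; // a single sample carries no trend information
        }
        double meanX = (n - 1) / 2.0;
        double meanY = 0.0;
        for (long s : samples) {
            meanY += s;
        }
        meanY /= n;
        double numerator = 0.0;
        double denominator = 0.0;
        for (int i = 0; i < n; i++) {
            numerator += (i - meanX) * (samples[i] - meanY);
            denominator += (i - meanX) * (i - meanX);
        }
        return numerator / denominator;
    }

    // True when the slope exceeds a caller-chosen (hypothetical) threshold.
    static boolean isRising(long[] samples, double threshold) {
        return slope(samples) > threshold;
    }

    public static void main(String[] args) {
        long[] steady = {4, 5, 4, 5, 4, 5};
        long[] growing = {4, 9, 15, 22, 30, 41};
        System.out.println("steady rising?  " + isRising(steady, 1.0));
        System.out.println("growing rising? " + isRising(growing, 1.0));
    }
}
```

With these made-up samples, the steady series is reported as not rising and the growing series as rising. A suitable threshold depends on the polling period and the application's normal load.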

Lock Profiler
The Terracotta runtime system can be directed to gather statistics about the distributed locks that result from the configuration you have created. These statistics can give you insight into the nature of your DSO application, letting you discover the code paths leading to highly contended access to shared state. The Lock Profiler panel lets you enable or disable lock statistics gathering, specify the lock code path trace depth for those statistics, and query the server to refresh the stats display.
Getting started with lock statistics:
The Lock Profiler panel comprises the Client and Server stats pages. The client lock stats are based on the code paths in your DSO client applications that resulted in a lock being taken out. The Server page stats concern the cluster-wide nature of the distributed locks.
Each lock has a corresponding identifier, the lock-id. For a named lock, the lock-id is the lock name. For an autolock, the lock-id is the server-generated id of the object on which that lock was taken out. An example of an autolock id is @1001. That autolock id corresponds to the shared object upon which distributed synchronization was carried out. You can use the object browser to view the state of shared object @1001.
A single lock expression in the configuration can result in the creation of multiple locks through the use of wildcard patterns. A single lock may be arrived at through any number of different code paths. A trace depth of zero means that you don't care how the lock was arrived at; increasing the trace depth breaks out the client-side lock stats by the corresponding number of Java call-stack frames. For example, there may be three different call sequences that resulted in a particular lock being granted, with one of the paths rarely entered and another responsible for the majority of those lock grants. By setting the trace depth appropriately you can gain insight into the behavior of your application and how it may affect the performance of your clustered system.
Note that trace stack frames include Java source line numbers only if the code was compiled with debugging enabled, commonly done by passing the -g flag to the javac command or, in Ant, by setting the debug="true" attribute on the javac task.

The Main Controls
Enable Lock Profiling
Use this control to turn lock profiling on or off.
Trace Depth
The trace depth control sets the number of client call-stack frames that are analyzed per lock event to record lock statistics.
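The effect of the trace depth setting on how lock events are grouped can be sketched as follows. This is an illustration only, not Terracotta code: the class and method names are hypothetical, call stacks are modeled as plain lists of method names (innermost frame first), and the methods Foo, Bar1, and Bar2 are made-up examples.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: not Terracotta code. Groups lock events by the innermost
// N call-stack frames, mimicking the effect of a trace depth setting of N.
final class TraceDepthSketch {

    // Builds a grouping key from the innermost `depth` frames of the stack.
    // `stack` lists frames innermost first, e.g. ["Foo", "Bar1"].
    static String groupKey(String lockId, List<String> stack, int depth) {
        int frames = Math.min(Math.max(depth, 0), stack.size());
        if (frames == 0) {
            return lockId; // depth 0: the call path is ignored entirely
        }
        List<String> used = new ArrayList<>(stack.subList(0, frames));
        Collections.reverse(used); // display as caller -> callee
        return lockId + " via " + String.join(" -> ", used);
    }

    // Counts lock requests per grouping key for one lock.
    static Map<String, Integer> countRequests(String lockId, List<List<String>> events, int depth) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (List<String> stack : events) {
            counts.merge(groupKey(lockId, stack, depth), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Three lock events on autolock @1001, all inside Foo(), reached from two callers.
        List<List<String>> events = Arrays.asList(
            Arrays.asList("Foo", "Bar1"),
            Arrays.asList("Foo", "Bar2"),
            Arrays.asList("Foo", "Bar1"));
        for (int depth = 0; depth <= 2; depth++) {
            System.out.println("depth " + depth + ": " + countRequests("@1001", events, depth));
        }
    }
}
```

In this sketch, depths 0 and 1 each produce a single group containing all three events, while depth 2 separates the Bar1 -> Foo path (two events) from the Bar2 -> Foo path (one event).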
A depth of 0 gathers lock statistics without regard to how the lock event was arrived at. A depth of 1 means that one call-stack frame is used to disambiguate different code paths when the lock event occurred. A depth of 2 uses two frames, and so on.
For example, with a setting of 1, all lock events within a method are recorded together regardless of the call path, because the only stack frame analyzed is the method in which the lock event occurred (the surrounding method). Beginning with a depth of 2, different call paths can be separated, because both the surrounding method and its calling method are used to record lock statistics. With a trace depth of 1, all lock events occurring within method Foo() are recorded as a single statistic (the number of lock requests, average held time, and so on). With a trace depth of 2, the callers of Foo(), say Bar1() and Bar2(), are also considered, so that a call path of Bar1() -> Foo() is recorded separately from Bar2() -> Foo().
Refresh
Use this button to get an updated view of the statistics gathered so far.

The Data Table

Lock Element Details
The bottom portion of the Clients page displays detail on the selected lock element. The currently selected lock trace is shown on the left and the configuration element responsible for the creation of the selected lock is shown on the right.
A special note is needed regarding the Server lock stats. The Terracotta system employs the concept of greedy locks to help improve performance by limiting unnecessary lock hops. Once a client has been awarded a lock, it is allowed to keep that lock until another client requests it. The assumption is that a client that obtains a lock will likely want to request that same lock again shortly. For example, if your cluster consisted of a single node that manipulated a single shared object repeatedly, the Server lock requests would be 1 until another client entered the cluster and began manipulating that object. When you see Server stats that are undefined (na), it is likely due to the nature of greedy locks.

Distributed Garbage Collection
Objects in a DSO root object graph may become unreferenced and no longer exist in the Terracotta client's heap. These objects are eventually marked as garbage and removed from a Terracotta server instance's heap and from persistent storage by the Terracotta Distributed Garbage Collector (DGC). The DGC is unrelated to the Java garbage collector.
To view a history table of DGC activity in the current cluster, click Distributed garbage collection in the cluster list. The history table is automatically refreshed each time a collection occurs. Each row in the history table represents one distributed garbage collection cycle, with the following columns:

Thread Dumps
You can get a snapshot of the state of each server and client in the Terracotta cluster using the Terracotta Administrator Console thread dumps feature. Every Terracotta cluster, server, and client in the cluster list has a thread-dumps panel. Each panel has a list logging completed thread dumps, and a pane to display the contents of a selected thread dump. Each panel also provides a case-sensitive Find tool for searching through the currently displayed thread dump.

Cluster Thread Dumps
To generate, view, and save a cluster-wide thread dump:

Server Thread Dumps
To generate and view a thread dump for any server process:

Client Thread Dumps
To generate and view a thread dump for any client process:

Recording and Viewing Statistics

Cluster Statistics Recorder
The Cluster Statistics Recorder panel can generate recordings of selected cluster-wide statistics. This panel has controls to start, stop, view, and export recording sessions. You can use the Snapshot Visualization Tool to view the recorded information.
For definitions of available statistics, see Cluster Statistics Definitions. To learn about configuring the Statistics Recorder, using its command-line interface, and more, see The Terracotta Cluster Statistics Recorder.

Snapshot Visualization Tool
The Snapshot Visualization Tool (SVT) provides a graphical view of cluster information and statistics. The view is created using data recorded with the Statistics Recorder.
The SVT is provided as a TIM, called tim-svt, which you can install from the TIM Update Center. Once the SVT is installed, start (or restart) the Terracotta Administrator Console and confirm that the View button on the Cluster Statistics Recorder is enabled. No changes to the Terracotta configuration file are necessary when you install the SVT.
SVT controls are defined below.
Import...
Load a cluster statistics recording that was saved to a file. Clicking Import... opens a standard file-selection window for locating the file.
Retrieve...
Find recorded sessions on active Terracotta servers. Clicking Retrieve... opens a dialog to enter the server address in the format <server_ip_address:JMX_port> or <server_name:JMX_port> as defined in tc-config.xml. Once connected to the server, available recorded sessions appear in the SVT Sessions menu.
Sessions
Select a recorded session from the Sessions drop-down menu to load it. Sessions lists the recorded sessions last retrieved from a Terracotta server.
Graph heights
Scale the y-axis on the graphs displayed in the SVT. Moving the slider to the left shrinks the graph height, while moving it to the right grows the graph height.
Server, Client, and Statistics checkboxes
Choose which servers and clients have their recorded statistics displayed in the graphs. For each server and client, choose which statistics are displayed in the graphs.

(Enterprise feature)
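The server address format accepted by the Retrieve... dialog, a host name or IP address followed by a colon and the JMX port, can be illustrated with a small parsing sketch. This is not how the console itself parses the address. The class and its validation rules are hypothetical, and the sample address reuses the default localhost and 9520 values mentioned earlier for a default cluster.

```java
// Illustrative only: not Terracotta code. Parses "host:JMX_port" text such as
// the server address entered in the SVT Retrieve... dialog.
final class ServerAddressSketch {
    final String host;
    final int jmxPort;

    private ServerAddressSketch(String host, int jmxPort) {
        this.host = host;
        this.jmxPort = jmxPort;
    }

    // Splits on the last colon so the port is always the final component.
    static ServerAddressSketch parse(String text) {
        int colon = text.lastIndexOf(':');
        if (colon < 0) {
            throw new IllegalArgumentException("Expected host:port but got: " + text);
        }
        String host = text.substring(0, colon).trim();
        String portText = text.substring(colon + 1).trim();
        if (host.isEmpty() || portText.isEmpty()) {
            throw new IllegalArgumentException("Host and port must both be present: " + text);
        }
        int port;
        try {
            port = Integer.parseInt(portText);
        } catch (NumberFormatException e) {
            throw new IllegalArgumentException("Port is not a number: " + portText, e);
        }
        if (port < 1 || port > 65535) {
            throw new IllegalArgumentException("Port out of range: " + port);
        }
        return new ServerAddressSketch(host, port);
    }

    public static void main(String[] args) {
        ServerAddressSketch address = parse("localhost:9520");
        System.out.println(address.host + " / " + address.jmxPort);
    }
}
```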
Backing Up Shared Data
Using the Terracotta backup feature, you can create a backup of the data being shared by your application. To access the backup feature, choose Backup database for the cluster whose servers you want to back up. The Backup database panel appears.
To perform a backup, click Backup DB. A dialog box appears where you can confirm the backup destination directory or enter a new destination. The database is always backed up to a directory called objectdb at the destination; this directory is created automatically if it does not exist.
To change the default backup directory path, edit the <data-backup> property in the Terracotta server's configuration file with the path to your preferred backup directory:
<server>

Enabling Backups
If a Terracotta server is not configured for permanent-store persistence, the Backup DB button is disabled and the following message appears on the Backup database panel:
Backup feature is currently disabled because the cluster is operating in temporary-swap-only persistence mode.
To enable the Backup DB button, change the value of the persistence mode property in the Terracotta server's configuration file to permanent-store:
<server>
<dso>
<persistence>
<mode>permanent-store</mode>
</persistence>
</dso>
</server>

Restoring a Backup
Terracotta maintains a copy of shared in-memory data on disk. In most server-failure cases, Terracotta automatically restores that shared data by loading it from the copy, and your application state is recreated. However, if you encounter a situation in which the data files are missing, you can restore them from backups.
To restore data files from a backup:

Update Checker
On a bi-weekly basis, the Terracotta Administrator Console checks, when first started, for updates to the Terracotta platform. By default, a notice informs you that an update check is about to be performed, allowing you to skip the immediate check, acknowledge and allow the check, or disable further checking. If the update check is allowed, the Terracotta Administrator Console queries the OpenTerracotta website (www.terracotta.org) and reports on any new updates. If the update checker feature is disabled, it can always be re-enabled via the Help|Update Checker... menu.

Appendix: Definitions of Cluster Statistics
The following categories of cluster information and statistics are available for viewing and recording in the Terracotta Cluster Statistics Recorder.
cache objects evict request
The total number of objects marked for eviction from the l1, or from the l2 to disk. Evicted objects are still referenced, and can be faulted back to the l1 from the l2 or from disk to l2. The SVT graphs this metric for each l1 and l2 separately. High counts imply that memory may be running low.
cache objects evicted
The number of objects actually evicted. If this metric is not close in value to cache objects evict request, then memory may not be getting freed quickly enough.
l1 l2 flush
The object flush rate when the l1 flushes objects to the l2 to free up memory or as a result of GC activity.
l2 faults from disk
The number of times the l2 has to load objects from disk to serve l1 object demand. A high faulting rate may indicate an overburdened l2.
l2 l1 fault
The number of times an l2 has to send objects to an l1 because the objects do not exist in the l1 local heap due to memory constraints. Better scalability is achieved when this number is lowered through improved locality and usage of an optimal number of JVMs.
memory (usage)
The amount of memory (heap) usage over time.
vm garbage collector
The standard Java garbage collector's behavior, tracked on all JVMs in the cluster.
distributed gc (distributed garbage collection, or DGC)
The behavior of the Terracotta tool that collects distributed garbage on an l2. The DGC can pause all other work on the l2 to ensure that no referenced objects are flagged for garbage collection.
l2 pending transactions
The number of Terracotta transactions held in memory by a Terracotta server instance for the purpose of minimizing disk writes. Before writing the pending transactions, the Terracotta server instance optimizes them by folding in redundant changes, thus reducing its disk access time. Any object that is part of a pending transaction cannot be changed until the transaction is complete.
stage queue depth
The depth to which visibility into Terracotta server-instance work queues is available. A larger depth value allows more detail to emerge on pending task-completion processes, bottlenecks due to application requests or behavior, types of work being done, and load conditions. Rising counts (of items in these processing queues) indicate backlogs and may point to performance degradation.
server transaction sequencer stats
Statistics on the Terracotta server-instance transaction sequencer, which sequences transactions as resources become available while maintaining transaction order.
network activity
The amount of data transmitted and received by a Terracotta server instance in bytes per second.
l2 changes per broadcast
The number of updates to objects per broadcast message (see l2 broadcast count).
message monitor
The network message count flowing over TCP from Terracotta clients to the Terracotta server.
l2 broadcast count
The number of times that a Terracotta server instance has transmitted changes to objects. This "broadcast" occurs any time the changed object is resident in more than one Terracotta client JVM. This is not a true broadcast since messages are sent only to clients where the changed objects are resident.
l2 transaction count
The number of Terracotta transactions being processed (per second) by a Terracotta server instance.
l2 broadcast per transaction
The ratio of broadcasts to Terracotta transactions. A high ratio (close to 1) means that each broadcast is reporting few transactions, and implies a high co-residency of objects and inefficient distribution of application data. A low ratio (close to 0) reflects high locality of reference and better options for linear scalability.
system properties
Snapshot of all Java properties passed in and set at startup for each JVM. Used to determine configuration states at the time of data capture, and for comparison of configuration across JVMs.
thread dump
Displays a marker on all statistics graphs in the SVT at the point when a thread dump was taken using the Cluster Statistics Recorder. Clicking the marker displays the thread dump.
disk activity
The number of operations (reads and writes) per second, and the number of bytes per second (throughput). The number of reads corresponds to l2 faulting objects from disk, while writes correspond to l2 flushing objects to disk. The SVT graphs the two aspects separately.
cpu (usage)
The percent of CPU resources being used. Dual-core processors are broken out into CPU0 and CPU1.

More Information
More information on the following topics is available in other Terracotta documentation:
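Help button: appears in the Terracotta Administrator Console wherever context-sensitive help is available. Click the help button to open help for the current context.
Runtime statistics panel controls:
- Starts data polling again (resets Pause).
- Pauses data polling and graphing.
- Deletes existing historical data from the graphs.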