TESTING WITH TERRACOTTA

Added by Tushar Khairnar, last edited by Puneet Bhardwaj on May 20, 2009

For some general do's/don't see

How does my functional testing change in the presence of Terracotta?

It doesn't, in the sense that black-box and white-box testing should proceed as normal. However, we require that you ensure maximal code coverage, so that no untested path remains in which an object reference joins the clustered object graph and then throws a runtime UnlockedSharedObjectException or TCNonPortableObjectError.
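To illustrate why coverage matters, here is a minimal sketch of a rarely executed path that only fails once the object is clustered. The class OrderBook and its methods are hypothetical. The sketch assumes that `orders` is declared as a Terracotta root and that `add` is covered by an autolock in tc-config; neither is shown here. On a plain JVM both paths run without error, which is exactly why the problem goes unnoticed until a test exercises the unlocked path under Terracotta.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical example class. Assume `orders` is configured as a clustered root.
final class OrderBook {
    private final List<String> orders = new ArrayList<String>();

    // Common path: the shared list is mutated while holding a lock on it.
    void add(String order) {
        synchronized (orders) {
            orders.add(order);
        }
    }

    // Rare error-handling path: the shared list is mutated with no lock held.
    // On a plain JVM this works. Once `orders` is clustered, an unlocked write
    // like this is the kind that fails at runtime with
    // UnlockedSharedObjectException, so a test must reach this line.
    void clearOnError() {
        orders.clear();
    }

    int size() {
        synchronized (orders) {
            return orders.size();
        }
    }
}
```

A functional test suite that only ever calls add() would pass with and without Terracotta; only a test that also drives clearOnError() exposes the missing lock.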

How should I performance/stress test with Terracotta? What metrics am I looking for?

  • Measure overall Throughput and Latency of at least a basket of transactions
  • Run the test in the presence of extraneous usage such as the Admin Console, an occasional SVT recording session, and monitoring infrastructure (e.g. JMX ping).
  • You probably need to monitor class creation, locks, DGC activity, and CPU/memory/disk/network on L1s and L2s. Most of these are available via the Terracotta Admin Console. You can augment them with nmon and similar tools - see http://www.terracotta.org/confluence/display/wiki/Testing+Distributed+Software
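As a rough sketch of the first bullet, the following times repeated runs of one transaction and reports throughput plus latency percentiles. The class TxnStats and its methods are hypothetical helpers, not part of Terracotta, and the percentile uses the simple nearest-rank method. A real test would run one such measurement per transaction type in the basket, from several client threads.

```java
import java.util.Arrays;

// Hypothetical helper: measures throughput and latency for one transaction type.
final class TxnStats {
    final int count;
    final double throughputPerSec;
    final long p50Nanos;
    final long p99Nanos;

    private TxnStats(int count, double throughputPerSec, long p50Nanos, long p99Nanos) {
        this.count = count;
        this.throughputPerSec = throughputPerSec;
        this.p50Nanos = p50Nanos;
        this.p99Nanos = p99Nanos;
    }

    // Runs the transaction `iterations` times on the calling thread.
    static TxnStats measure(Runnable txn, int iterations) {
        if (iterations <= 0) {
            throw new IllegalArgumentException("iterations must be positive");
        }
        long[] latencies = new long[iterations];
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            long t0 = System.nanoTime();
            txn.run();
            latencies[i] = System.nanoTime() - t0;
        }
        long elapsed = Math.max(1L, System.nanoTime() - start);
        Arrays.sort(latencies);
        double perSec = iterations * 1e9 / elapsed;
        return new TxnStats(iterations, perSec,
                percentile(latencies, 50), percentile(latencies, 99));
    }

    // Nearest-rank percentile over an ascending-sorted array.
    static long percentile(long[] sorted, int pct) {
        int rank = (int) Math.ceil(pct / 100.0 * sorted.length);
        return sorted[Math.max(0, rank - 1)];
    }
}
```

Comparing the numbers from a run with the Admin Console and monitoring attached against a run without them gives a feel for how much the extraneous usage costs.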

How do I execute availability testing in QA/Stage?

  • You may or may not want to repeat the infrastructure failure tests, especially if you are mirroring the certified config.

I have clients on the WAN and the LAN - can they both connect to the same Terracotta servers?

  • Yes, they can.
  • However, a WAN client may not relinquish locks as promptly as a LAN client (depending on the WAN latencies you experience), so LAN clients may see higher latency and/or contention.
  • Similarly, a WAN client may not have as reliable a connection to the Terracotta server. The Terracotta server allows an L1 (client) to reconnect within a certain window after the persistent TCP/IP connection between itself and the L1 is interrupted. You specify the window in tc.properties (defaults: l1.reconnect.enabled=false; l1.reconnect.timeout.millis=5000). Therefore, if WAN connectivity is interrupted for 4s with reconnect enabled, the remote WAN client can still reconnect to the Terracotta server without issue, but resources held by that client are blocked across the cluster for those 4s.
  • Therefore, allowing WAN clients connectivity ought to be treated with caution.
  • For read-only cases that tolerate some asynchrony, you might be able to clone the state and have WAN clients periodically pick up a copy from a shared data structure (e.g. a Queue).
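The read-only approach in the last bullet could look roughly like the sketch below. SnapshotFeed and its methods are hypothetical, and the code uses plain java.util.concurrent classes; in a Terracotta deployment the queue would be the shared structure, which is an assumption not shown here. The sketch also assumes a single publisher and a single WAN consumer per queue.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical sketch: LAN side publishes copies, WAN side reads copies only.
final class SnapshotFeed {
    private final Map<String, String> liveState = new HashMap<String, String>();
    private final BlockingQueue<Map<String, String>> snapshots;

    SnapshotFeed(int capacity) {
        this.snapshots = new LinkedBlockingQueue<Map<String, String>>(capacity);
    }

    // LAN side: normal updates to the live state.
    void put(String key, String value) {
        synchronized (liveState) {
            liveState.put(key, value);
        }
    }

    // LAN side, called periodically: clone the state and enqueue the copy.
    // If the queue is full, the oldest snapshot is dropped to make room.
    void publishSnapshot() {
        Map<String, String> copy;
        synchronized (liveState) {
            copy = new HashMap<String, String>(liveState);
        }
        Map<String, String> snapshot = Collections.unmodifiableMap(copy);
        while (!snapshots.offer(snapshot)) {
            snapshots.poll();
        }
    }

    // WAN side: drain the queue and return the newest snapshot, or null if
    // nothing new has been published since the last call.
    Map<String, String> latestSnapshot() {
        Map<String, String> latest = null;
        Map<String, String> next;
        while ((next = snapshots.poll()) != null) {
            latest = next;
        }
        return latest;
    }
}
```

Because the WAN client only ever touches its own immutable copy, it never holds a lock on the live state, so a slow or interrupted WAN link cannot block LAN clients.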

TESTING TOOLS:

How can I test WAN clients?

  • The Terracotta distribution bundles a primitive proxy and load balancer.
  • Look at com.tc.net.proxy.TCPProxy in tc.jar (or on SVN if you want to look at the source).
  • Usage is something like this:
    • cd to Terracotta-install/lib
    • host-siyer$ java -classpath tc.jar com.tc.net.proxy.TCPProxy
    • usage: TCPProxy <listen port> <endpoint[,endpoint...]> [delay]
      • <listen port> - The port the proxy should listen on
      • <endpoint> - Comma separated list of 1 or more <host>:<port> pairs to round robin requests to
      • [delay] - Millisecond delay between network data (optional,default: 0)
    • host-siyer$ java -classpath tc.jar com.tc.net.proxy.TCPProxy 9000 <tc-server-host>:9510 500
      • Thu Sep 13 11:04:53 PDT 2007: Starting listener on port 9000, proxying to [server:9510] with 500ms delay
    • So now you point the tc-config server element to the host and port where TCPProxy is running.
    • proxy> help
      • h - this help message
      • s - print proxy status
      • d <num> - adjust the delay time to <num> milliseconds
      • c - close all active connections
      • l - toggle debug logging
      • q - quit (shutdown proxy)
    • You can change the WAN delay at will (e.g. to 7s) while the proxy is running:
      • proxy> d 7000
      • proxy> quit
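To make the idea of a delaying proxy concrete, here is a stripped-down sketch of the technique. It is not the source of com.tc.net.proxy.TCPProxy; the class DelayProxySketch and all of its methods are hypothetical, it forwards to a single endpoint only, and it has no console. It accepts a connection, opens one to the target, and sleeps for the configured delay before forwarding each chunk of data in either direction.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.ServerSocket;
import java.net.Socket;

// Illustrative only: a minimal delaying TCP forwarder, not Terracotta's TCPProxy.
final class DelayProxySketch implements Runnable {
    private final ServerSocket listener;
    private final String targetHost;
    private final int targetPort;
    private volatile long delayMillis;

    DelayProxySketch(int listenPort, String targetHost, int targetPort, long delayMillis)
            throws IOException {
        this.listener = new ServerSocket(listenPort); // port 0 picks a free port
        this.targetHost = targetHost;
        this.targetPort = targetPort;
        this.delayMillis = delayMillis;
    }

    int listenPort() { return listener.getLocalPort(); }

    // Takes effect for the next chunk forwarded on any connection.
    void setDelayMillis(long millis) { this.delayMillis = millis; }

    void close() throws IOException { listener.close(); }

    public void run() {
        while (!listener.isClosed()) {
            Socket client;
            try {
                client = listener.accept();
            } catch (IOException e) {
                return; // listener was closed or failed: stop accepting
            }
            try {
                Socket server = new Socket(targetHost, targetPort);
                pump(client, server);
                pump(server, client);
            } catch (IOException e) {
                closeQuietly(client); // target unreachable: drop this client
            }
        }
    }

    // Copies bytes one way, sleeping before each chunk is forwarded.
    private void pump(final Socket from, final Socket to) {
        Thread t = new Thread(() -> {
            byte[] buf = new byte[8192];
            try {
                InputStream in = from.getInputStream();
                OutputStream out = to.getOutputStream();
                int n;
                while ((n = in.read(buf)) != -1) {
                    Thread.sleep(delayMillis);
                    out.write(buf, 0, n);
                    out.flush();
                }
            } catch (IOException | InterruptedException e) {
                // connection dropped or thread interrupted: fall through to close
            } finally {
                // Simplification: closing both sockets when one direction ends.
                closeQuietly(from);
                closeQuietly(to);
            }
        });
        t.setDaemon(true);
        t.start();
    }

    private static void closeQuietly(Socket s) {
        try { s.close(); } catch (IOException ignored) { }
    }
}
```

With a delay on each direction, one request/response round trip through such a proxy picks up roughly twice the configured delay, which is worth keeping in mind when choosing a value to emulate a given WAN latency.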

Do you have an automated framework for Testing?

  • Droid is a scripted distributed testing framework.
    • Droid is a minimal framework for starting up, configuring, synchronizing, and collecting statistics about a test run on multiple machines. There are two concepts - the agent, and the worker (both just Java programs running in a JVM). The activities of both are scripted using Groovy, which is a scripting language that runs on the JVM and has many syntax similarities to Java.
    • An agent must be started on every machine that participates in the test. The agent runs a script that can coordinate with other agents (via Terracotta of course) and can choose how and when to start workers. The worker also starts with a Groovy script and performs the actual work of the test. The agent/worker split is necessary to allow things like a test that starts up a cluster, then later adds or removes nodes of the cluster. In that case, the agents run throughout the test but workers may be started or stopped according to the agent's script.
  • Cachetest is a set of tests that use Droid as a framework to run across JVMs.
    • To run the cache tests, you will need to grab and build both projects. At runtime, you run the Droid framework, using wrapper scripts from cachetest to invoke it with the cache testing code. You can download the tests from http://svn.terracotta.org/svn/forge/projects/cachetest
    • You will need to modify agent.sh according to your environment. The cache tests are designed for distributed testing, so you should test on multiple machines; however, you can run everything on a single machine during development.
    • The agents use a clustered CyclicBarrier to wait for each other, so the tests don't have to be started at exactly the same time. They start once the number of agents specified in the AGENTS variable in agent.sh has arrived at the barrier.
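The barrier behaviour can be sketched with a plain java.util.concurrent.CyclicBarrier, using threads in one JVM to stand in for agents on separate machines. AgentBarrierSketch is a hypothetical class, not Droid code; in the real setup the barrier is clustered through Terracotta and the party count comes from AGENTS in agent.sh.

```java
import java.util.concurrent.BrokenBarrierException;
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: threads play the role of agents arriving at different times.
final class AgentBarrierSketch {

    // Returns how many agents were released only after all agents had arrived.
    static int runAgents(final int agents) {
        final CyclicBarrier barrier = new CyclicBarrier(agents);
        final AtomicInteger arrived = new AtomicInteger();
        final AtomicInteger startedTogether = new AtomicInteger();
        Thread[] threads = new Thread[agents];

        for (int i = 0; i < agents; i++) {
            final long arrivalDelayMillis = 20L * i; // staggered start times
            threads[i] = new Thread(() -> {
                try {
                    Thread.sleep(arrivalDelayMillis);
                    arrived.incrementAndGet();
                    barrier.await(); // blocks until `agents` parties have arrived
                    if (arrived.get() == agents) {
                        startedTogether.incrementAndGet();
                    }
                } catch (InterruptedException | BrokenBarrierException e) {
                    Thread.currentThread().interrupt();
                }
            });
            threads[i].start();
        }

        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for agents", e);
            }
        }
        return startedTogether.get();
    }
}
```

Calling AgentBarrierSketch.runAgents(3) starts three staggered "agents"; none proceeds past await() until the third has arrived, which is why the start order and timing of the real agents do not matter.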

How does one write system integration tests that use Terracotta?