Browse by Tags
-
When I create a channel to a service, how do I know when the service is ready to process the data for that channel?

A channel doesn't really know what the service is doing. The service might be actively processing the data being sent over the channel. Or, the service might not.

There is a constant tension in the system between components that want to push data and components that want to pull data. Components that push data actively work as long as there is data available, until a back pressure builds up in the system that resists their ability to push. This back pressure is typically the result of some queue or buffer that has filled up and is no longer able to accept the things being pushed into it. Components that push include several transport and protocol channels, as well as the service dispatcher that pushes messages to the service implementation.

Components that pull data remain quiescent, even if data is available, until someone actually wants to consume some data. Many transport and protocol channels pull messages rather than push.

A single implementation may feature both push and pull modes. For example, the TCP channel has a small portion that works in a push mode for connection establishment and transferring some initial data, while the application portion of TCP works in a pull mode.

Depending on the visibility of these push and pull modes, you might be able to tell from the behavior of a particular protocol what the service is doing. For example, if you're using just TCP, then the initial transmissions needed to open a connection can only be completed once a little bit of pulling has been done on the receiving application side, thus telling you that the service is doing some active processing. On the other hand, if you're using a ReliableMessaging channel or a OneWay channel, then those protocols have a visible running state where they are in push mode. You can't actually tell whether the service is working until you fill up some of the buffers in the protocol stack and start getting push back in the form of rejected messages. That means the service is not working as fast as you're sending data.

A queued channel would be an extreme example of push mode. A message queue allows you to push large amounts of data when the service is not even running.

Therefore, to know in the general case whether the service is ready to process for you, you need to be able to ask that question of the service rather than of the channel.

Next time: Composing
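The push-back behavior described above can be sketched with a bounded buffer. This is only a minimal model, not WCF code: the `push_all` helper, the buffer size of 3, and the message names are assumptions made for illustration.

```python
import queue


def push_all(buffer, messages):
    """Push messages until the buffer pushes back; return (accepted, rejected)."""
    accepted, rejected = [], []
    for message in messages:
        try:
            buffer.put_nowait(message)
            accepted.append(message)
        except queue.Full:
            # Back pressure: the buffer is full and rejects the push.
            rejected.append(message)
    return accepted, rejected


# A bounded buffer standing in for some queue in the protocol stack
# (the size is a hypothetical value).
buffer = queue.Queue(maxsize=3)

# Nobody is pulling yet. The first pushes still succeed, so the sender
# cannot tell an idle service from a busy one until the buffer fills.
accepted, rejected = push_all(buffer, ["m1", "m2", "m3", "m4", "m5"])

# Once the receiving side pulls one message, one more push fits.
pulled = buffer.get_nowait()
accepted_later, rejected_later = push_all(buffer, ["m6", "m7"])
```

The sender learns nothing from the first three successful pushes. Only the rejections reveal that the receiver is consuming more slowly than the sender is producing.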
|
-
You have talked in the past about how a service has both local settings and settings that are shared through policy. How can I transmit all settings through policy to the client?

The two types of settings are clearly distinguishable. Shared settings are required to have agreement between the client and server for the two to interoperate. Examples of shared settings are the protocols and formats being used to transmit messages. Local settings are not required to have agreement between the client and server for the two to interoperate. Examples of local settings are the limits for the time and space allowed to process a message.

Not only can local settings be in disagreement between the client and server, but they frequently do not make sense to share between the two. The messages sent between the client and server are rarely symmetric. The processing resources available to the client and server are rarely the same. The security concerns of the client and server are rarely in agreement.

You can transmit local settings by creating your own policy assertions that both sides implement. This will involve a lot of hassle, particularly if the service wants to have local setting values that are different from those sent in the policy. Finally, the client will need to absolutely trust the service because you are asking the user to run with settings that were supplied by a third party.

Why do you want to do this using policy? It seems that if you have such a level of trust with the client, then you probably already have more direct ways of pushing configuration and executables to the client machine.

Next time: Enabling Performance Counters
|
-
How can I speed up message processing when using MSMQ with WCF?

For small gains, it is generally possible to eke out a few percentage points of performance by tuning parameters and settings according to the application domain knowledge you have. For large gains, you are likely going to have to think about larger design issues. In particular, a type of design issue that you should consider in your quest for large gains is to question what features are truly needed to build your application.

This article is about a few of those feature decisions that you can make when using MSMQ together with WCF. Since these design decisions require building your application with a restricted set of features, there's no guarantee that these techniques are going to be applicable for you. It really just depends on the queue features you need versus the features you don't. I'm not going to mention features not primarily related to the queue, such as whether to use message security, although obviously the same type of analysis can be applied.

Use the NetMsmqBinding instead of the MsmqIntegrationBinding. Of the two bindings, NetMsmq is faster than MsmqIntegration in most cases when similar conditions are applied (for example, both running without security). I have an earlier article describing the differences between the two MSMQ bindings.

Disable transactional delivery of messages. The ExactlyOnce option controls whether messages are delivered without being lost or duplicated. Making a delivery guarantee requires that the queue be transactional and issue transactions for transfers. If you need best effort rather than exactly once, turning off transactions noticeably improves performance.

Disable durable message storage. If you've already turned off transactions, then you can go significantly farther as well. The Durable option controls whether the messages in the queue survive restarting the MSMQ service. If you are using the queue to get asynchronous communication rather than reliable delivery, then leaving messages in a volatile store is another noticeable performance improvement.

Pack more messages into the same transaction. Often you can't go as far as turning transactions completely off. A less dramatic step is to use the same transaction for multiple receive operations. The TransactedBatchingBehavior allows you to group messages up to a maximum batch size in a single transaction to amortize creating a transaction across multiple receive calls.

Next time: Streaming and ToString
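To see why batching receives amortizes the transaction overhead, here is a rough cost model. It is a sketch under stated assumptions rather than a measurement of MSMQ: the helper names and the cost units (5 per transaction, 1 per receive) are hypothetical.

```python
def receive_in_batches(messages, max_batch_size):
    """Group pending messages so that each group shares one transaction."""
    if max_batch_size < 1:
        raise ValueError("max_batch_size must be at least 1")
    return [
        messages[start:start + max_batch_size]
        for start in range(0, len(messages), max_batch_size)
    ]


def total_cost(batches, transaction_cost, receive_cost):
    """One transaction per batch plus one receive per message."""
    message_count = sum(len(batch) for batch in batches)
    return len(batches) * transaction_cost + message_count * receive_cost


pending = [f"msg{i}" for i in range(10)]

# Hypothetical cost units: a transaction costs 5, a receive costs 1.
unbatched = total_cost(receive_in_batches(pending, 1), 5, 1)
batched = total_cost(receive_in_batches(pending, 4), 5, 1)
```

With these made-up numbers, ten messages cost 60 units when each receive has its own transaction and 25 units when up to four receives share one. The real ratio depends entirely on how expensive a transaction is relative to a receive in your environment.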
|
-
How do I push back against clients that are tying up the external connections of my service?

The amount of service connection resources used by the client can be thought of as a product of two dimensions. The first dimension is the number of connections that the client has open. The second dimension is the length of time that the client holds the connections open. This is a typical time-space product for measuring utilization. Another way to slice the problem might be network link capacity for space and transfer duration for time. The product that you're going to optimize for is problem dependent, but I'll use number of connections for the example.

Each of the dimensions has a quota value that we can use to push back against clients. We push back against space usage by throttling the number of concurrent sessions or instances. This really only makes sense if you have some way of identifying a particular client across multiple sessions because otherwise the client can just knock out other competitors to grab more resources. A typical way of identifying the client is by requiring authentication. You could also do some kind of traffic shaping at the network level, although that's best done in front of your service rather than on the same machine.

We push back against time usage by limiting the connection lifetime (such as through operation and receive timeouts).

The general solution to this problem is to identify the factors that make up the product, pick quotas that protect those factors, and then tune quota values. The balance between the factors is another problem-dependent piece. For example, you may have to keep the time quota above a certain value due to the latency of the network connections you're using. However, this is just constraining the range for the corresponding space quota that will let you hit your target value for the product.

Next time: Initializing the Context
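The trade-off between the two quotas can be worked through with simple arithmetic. The function below is a minimal sketch; the name `space_quota`, the target of 600 connection-seconds per client, and the timeout values are all assumptions chosen for the example.

```python
def space_quota(target_product, time_quota_seconds):
    """Largest connection count that keeps connections x seconds within the target."""
    if time_quota_seconds <= 0:
        raise ValueError("time quota must be positive")
    return int(target_product // time_quota_seconds)


# Hypothetical target: at most 600 connection-seconds per client.
target = 600

# If network latency forces the timeout to stay at 30 seconds or more,
# the connection quota can be at most 20 to stay within the target.
connections_at_30s = space_quota(target, 30)

# Doubling the time quota halves the connection quota for the same product.
connections_at_60s = space_quota(target, 60)
```

Fixing one factor, such as a minimum timeout imposed by latency, determines the range left for the other.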
|
-
How do I construct callbacks to work over a load balancer without affinity?

Let's construct a scenario to demonstrate this question. I have three machines; call them X, Y, and Z. X and Y are together behind a network load balancer. This is a server to server communication scenario, where two servers are attempting to talk over a duplex contract. One of the load-balanced servers, X or Y, is going to first act as the client. Pretend that X is the relevant server in this case. X calls a service on Z that has a callback contract. At some point in the future, Z is going to respond on that callback to the load-balanced group.

If X passed its real address to Z, then Z has no problem making the callback. If X gives the load-balancer address, then the callback from Z will sometimes reach X and sometimes reach Y. The load balancer is not affinitized to a particular machine.

The interesting case is where we haven't pinned X as the instance to respond to. What can we do to make sure that the request by X is correlated with the response by Z, regardless of whether that response goes to X or Y? Well, one of two things needs to happen.

Z can stuff all of the necessary context information into the response message so that any server could process the response without having to know about the previous conversation. This is essentially turning a stateful problem into a stateless problem that sends a whole bunch more data. This has turned out to be a pretty interesting solution from the HTTP developer front.

X and Y can share a common, durable store of correlation information. This is typically a database, but we don't have to be specific about how X and Y share state between themselves.

If you picked something in between going totally stateless and having durable state management, then there would be some interesting implications. There would be situations in which the receiving server would need to invent correlation information out of thin air in order to properly interpret the message. You can fake this some of the time, but sooner or later you'll get caught.

Next time: Poison Message Handling
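The stateless option can be sketched as follows: the request carries its own context, Z echoes that context back, and whichever server receives the callback works from the message alone. The message layout, the function names, and the field names (`customer`, `step`) are hypothetical and are not part of any WCF contract.

```python
import json


def make_request(context):
    """X builds a request that carries everything needed to resume later."""
    return json.dumps({"context": context})


def make_callback(request, result):
    """Z echoes the context back unchanged alongside its result."""
    envelope = json.loads(request)
    return json.dumps({"context": envelope["context"], "result": result})


def handle_callback(server_name, callback):
    """Any server behind the balancer can process the callback from the message alone."""
    envelope = json.loads(callback)
    context = envelope["context"]
    return (
        f"{server_name} resumed {context['step']} "
        f"for {context['customer']}: {envelope['result']}"
    )


request = make_request({"customer": "contoso", "step": "billing"})
callback = make_callback(request, "approved")

# The load balancer may deliver the callback to either machine.
handled_by_x = handle_callback("X", callback)
handled_by_y = handle_callback("Y", callback)
```

Neither handler consults any state left behind by the original call, which is the property that makes the load balancer's choice irrelevant. The price is a larger message, and in practice you would also want to protect the echoed context from tampering.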
|
-
I'm building an always-on service that gets its messages from a front-end queue. How do I design this service to be scalable?

There are two directions to go in when talking about scalability. There's scaling up, handling more messages while continuing to use a single machine, and then there's scaling out, handling more messages by using multiple machines. We'll talk about the two individually even though you may be using a combination of techniques to improve scalability.

Scaling up is the simpler of the two, although it isn't as interesting to talk about here. The WCF model works very naturally and automatically with scaling up. No matter how powerful the machine gets, it rarely makes sense to replicate an always-on service on a single machine. Rather than duplicating processes, single machine scalability in WCF is achieved by adding more threads to a single process. In other words, you'll get automatic scaling up until the point that you start hitting service model quota limits. This won't take long, as the default service model limits are designed for a single processor machine with 10 concurrent clients. However, in the binding you can easily start increasing the number of threads pumping messages and the number of threads processing calls until you hit the hardware limitations of the machine. This is a balancing act to reach a point where resources are never idle, but they're also not overcommitted and facing contention.

Scaling out involves replicating the server process to multiple machines. Typically the machines are tied together with network load balancing so that the distribution of load on the farm is controlled by the administrator rather than the user. Each of the machines needs its quotas tuned as above, although homogeneous hardware will let you get away with using a homogeneous binding on all of the machines. An asymmetric farm, with machines of varying capabilities, is really hard to tune.

Queues come into the picture because they generally favor scaling up rather than scaling out. Having a local queue has a number of advantages, including better transaction support unless you happen to be running a Vista or later operating system. However, it is generally much cheaper to scale out than scale up.

A quick way to judge which way you need to scale next with your queue is to look at the bottlenecked resource. If you have exhausted the amount of network bandwidth you have moving messages out of the queue, then you should be thinking about moving...
|
-
I'm making multiple calls to a service and all of the other calls sit waiting until the first call completes. How do I make the service process multiple calls at the same time?

If a service doesn't accept multiple, simultaneous calls from your client, then you have to figure out whether the problem is actually on the client or the service.

On the client side, what typically goes wrong here is that the first call blocks all other progress. After the first call completes, you can see multiple calls going out to the service. This is caused by the first call being responsible for opening the connection to the service. You can choose to open a connection to the service manually by calling Open on the client; otherwise, the first call will automatically open a connection if there isn't one already. Until the connection to the server is established, none of the other calls can proceed, and that causes the blocking behavior. The solution is to always open the client connection yourself ahead of time by calling Open so that this process is hopefully complete by the time you want to start making calls.

If every call blocks, even after the first one, then this is probably due to the service configuration. There are two knobs that control whether multiple calls are allowed into the service, called InstanceContextMode and ConcurrencyMode. You can set both of these knobs through a service behavior.

InstanceContextMode controls how often we'll create a new instance. There will only ever be one context if you pick Single and there can be multiple contexts if you pick PerSession or PerCall. PerSession is probably the sweet spot but requires your channel stack to serve up sessionful channels, as from buffered TCP or from reliable messaging. It can be expensive to make your channel stack sessionful if it does not already support sessions from the protocols that you've picked.

ConcurrencyMode controls how many threads we'll let into a service instance. There will only ever be one thread per instance if you pick Single or Reentrant, although Reentrant allows a thread to leave from a call and later come back in. There can be more than one thread per instance if you pick Multiple. You really don't have a choice about ConcurrencyMode because you either wrote the service to support multiple threads or you didn't. I don't see a compelling reason to write your service to support multiple threads but then restrict it to only one.

You should now be able to spot how InstanceContextMode and...
|
-
My web service needs to periodically broadcast messages to clients. The service is an Internet-facing application hosted inside of IIS. What’s the best way to do this?

The big limitation in this scenario is that your clients might be behind a firewall and non-addressable. There are basically two architecture camps for broadcasting messages to clients over the Internet. The push architecture camp has the clients maintain a continuous connection to the server and pushes data out at each update. The pull architecture camp has the clients periodically poll the server to see if there’s any new data. Both of these architectures are widely used and they trade latency off against resources. There are a few other architectures that work locally as well but aren’t as useful over the Internet, such as multicasting and callbacks. I’m just going to pick one of these and talk about using a push architecture.

The basic way to build a push architecture is to have clients connect to the server and then the server holds the connection open indefinitely to send messages.

If your service is hosted in IIS version 6 or below, then you don’t have a lot of choice about the network protocol. Pushing data from the server to client is difficult with HTTP because the protocol is built on top of the request-reply model. A typical way of using HTTP to push is to make an empty request and send back a response using the chunked transfer encoding. Chunked transfers allow an HTTP server to send the response in pieces without having to specify the total length of the response up front. Normally, the client knows the message is done when the connection is closed or the pre-announced content length is reached. Neither of those options works in this case. Instead, the server needs to define some framing protocol so that the client can tell when the individual messages are done. The easiest way to get chunked HTTP transfers in WCF is to use the HTTP transport with streaming enabled.

If your service is hosted in IIS 7 (the Vista/Longhorn version of IIS), then you are able to pick other network protocols for your service, such as TCP. TCP is inherently duplex so it works across the Internet for server-initiated transfer of messages without having to connect back through a firewall. Since TCP is duplex, you can write a duplex contract that gives you a very nice programming model for sending the broadcasts. This is a lot less work on your part than the equivalent setup needed with HTTP.
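For the chunked HTTP case, the framing protocol that the server needs to define could be as simple as a length prefix in front of each message. The sketch below is one possible scheme, not something WCF prescribes: the 4-byte big-endian header and the `frame` and `FrameReader` names are assumptions made for illustration.

```python
import struct

# Assumed framing format: a 4-byte big-endian length, then the payload.
HEADER = struct.Struct("!I")


def frame(payload: bytes) -> bytes:
    """Prefix a message with its length so the reader can find its end."""
    return HEADER.pack(len(payload)) + payload


class FrameReader:
    """Reassembles complete messages from arbitrarily split chunks."""

    def __init__(self):
        self._buffer = bytearray()

    def feed(self, chunk: bytes):
        """Add a received chunk and return the messages completed by it."""
        self._buffer.extend(chunk)
        messages = []
        while len(self._buffer) >= HEADER.size:
            (length,) = HEADER.unpack_from(self._buffer, 0)
            end = HEADER.size + length
            if len(self._buffer) < end:
                break  # The rest of this message has not arrived yet.
            messages.append(bytes(self._buffer[HEADER.size:end]))
            del self._buffer[:end]
        return messages


# Two broadcasts sent back to back on one long-lived response stream.
stream = frame(b"price=10") + frame(b"price=11")

# Chunk boundaries on the wire need not line up with message boundaries.
reader = FrameReader()
first = reader.feed(stream[:5])
second = reader.feed(stream[5:14])
third = reader.feed(stream[14:])
```

The client never relies on the connection closing or on a content length for the whole response. It only needs the header to know where each individual broadcast ends.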
|
-
I have two web services and I’m seeing a deadlock when making calls between them. The operation calls are marked as OneWay. How do I fix this? And, how is it even possible for one-way calls to block?

Marking an operation with the OneWay attribute doesn’t offer any magical protection against deadlocks. The OneWay attribute means that in messaging terms, there is no application data that the service is expected to return as a result of the call. However, one-way calls still need to wait for the service to accept transmission of the message. A one-way call is complete when the service acknowledges that the message has successfully arrived (although the definition of successful can vary a bit). We’ve seen this before in HTTP, where successful acknowledgement takes the form of a 202 Accepted response.

There are a number of reasons, then, why a service might appear to have deadlocked. The service could simply be really slow and have either not accepted the connection or not finished processing to the point where it can send the acknowledgement. Now that you know that the one-way call can block, it takes more sleuthing before you can declare that a deadlock is taking place.

The operation call could be held in limbo for a while because of the ConcurrencyMode setting. Setting ConcurrencyMode to Single prevents the service from taking a second call in while an existing call is underway. The operation call could also be waiting for the SynchronizationContext to become free.

Basically, the problem with the one-way call could be almost any of the problems that you’d associate with a standard bidirectional call. Applying OneWay attributes to your calls does not get you out of having to debug your service deadlocks.

Next time: How HostNameComparisonMode Works
|