minio

Commit Graph

Author	SHA1	Message	Date
Klaus Post	51aa59a737	perf: websocket grid connectivity for all internode communication (#18461 ) This PR adds a WebSocket grid feature that allows servers to communicate via a single two-way connection. There are two request types: * Single requests, which are `[]byte => ([]byte, error)`. This is for efficient small roundtrips with small payloads. * Streaming requests which are `[]byte, chan []byte => chan []byte (and error)`, which allows for different combinations of full two-way streams with an initial payload. Only a single stream is created between two machines - and there is, as such, no server/client relation since both sides can initiate and handle requests. Which server initiates the request is decided deterministically on the server names. Requests are made through a mux client and server, which handles message passing, congestion, cancelation, timeouts, etc. If a connection is lost, all requests are canceled, and the calling server will try to reconnect. Registered handlers can operate directly on byte slices or use a higher-level generics abstraction. There is no versioning of handlers/clients, and incompatible changes should be handled by adding new handlers. The request path can be changed to a new one for any protocol changes. First, all servers create a "Manager." The manager must know its address as well as all remote addresses. This will manage all connections. To get a connection to any remote, ask the manager to provide it given the remote address using. ``` func (m Manager) Connection(host string) Connection ``` All serverside handlers must also be registered on the manager. This will make sure that all incoming requests are served. The number of in-flight requests and responses must also be given for streaming requests. The "Connection" returned manages the mux-clients. Requests issued to the connection will be sent to the remote. * `func (c Connection) Request(ctx context.Context, h HandlerID, req []byte) ([]byte, error)` performs a single request and returns the result. Any deadline provided on the request is forwarded to the server, and canceling the context will make the function return at once. `func (c Connection) NewStream(ctx context.Context, h HandlerID, payload []byte) (st Stream, err error)` will initiate a remote call and send the initial payload. ```Go // A Stream is a two-way stream. // All responses must be read by the caller. // If the call is canceled through the context, //The appropriate error will be returned. type Stream struct { // Responses from the remote server. // Channel will be closed after an error or when the remote closes. // All responses must be read by the caller until either an error is returned or the channel is closed. // Canceling the context will cause the context cancellation error to be returned. Responses <-chan Response // Requests sent to the server. // If the handler is defined with 0 incoming capacity this will be nil. // Channel must be closed to signal the end of the stream. // If the request context is canceled, the stream will no longer process requests. Requests chan<- []byte } type Response struct { Msg []byte Err error } ``` There are generic versions of the server/client handlers that allow the use of type safe implementations for data types that support msgpack marshal/unmarshal.	2023-11-20 17:09:35 -08:00
Harshavardhana	ffd497673f	internode lockArgs should use messagepack (#13329 ) it would seem like using `bufio.Scan()` is very slow for heavy concurrent I/O, ie. when r.Body is slow , instead use a proper binary exchange format, to marshal and unmarshal the LockArgs datastructure in a cleaner way. this PR increases performance of the locking sub-system for tiny repeated read lock requests on same object. ``` BenchmarkLockArgs BenchmarkLockArgs-4 6417609 185.7 ns/op 56 B/op 2 allocs/op BenchmarkLockArgsOld BenchmarkLockArgsOld-4 1187368 1015 ns/op 4096 B/op 1 allocs/op ```	2021-09-30 11:53:01 -07:00
Harshavardhana	069432566f	update license change for MinIO Signed-off-by: Harshavardhana <harsha@minio.io>	2021-04-23 11:58:53 -07:00
Anis Elleuch	7be7109471	locking: Add Refresh for better locking cleanup (#11535 ) Co-authored-by: Anis Elleuch <anis@min.io> Co-authored-by: Harshavardhana <harsha@minio.io>	2021-03-03 18:36:43 -08:00
Harshavardhana	9cdd981ce7	fix: expire locks only on participating lockers (#11335 ) additionally also add a new ForceUnlock API, to allow forcibly unlocking locks if possible.	2021-01-25 10:01:27 -08:00
Harshavardhana	d9db7f3308	expire lockers if lockers are offline (#10749 ) lockers currently might leave stale lockers, in unknown ways waiting for downed lockers. locker check interval is high enough to safely cleanup stale locks.	2020-10-24 13:23:16 -07:00
Harshavardhana	eafa775952	fix: add lock ownership to expire locks (#10571 ) - Add owner information for expiry, locking, unlocking a resource - TopLocks returns now locks in quorum by default, provides a way to capture stale locks as well with `?stale=true` - Simplify the quorum handling for locks to avoid from storage class, because there were challenges to make it consistent across all situations. - And other tiny simplifications to reset locks.	2020-09-25 19:21:52 -07:00
Harshavardhana	90cff10e2b	avoid crash if disks are not initialized	2020-09-23 12:00:29 -07:00
Harshavardhana	7ed1077879	Add a custom healthcheck function for online status (#9858 ) - Add changes to ensure remote disks are not incorrectly taken online if their order has changed or are incorrect disks. - Bring changes to peer to detect disconnection with separate Health handler, to avoid a rather expensive call GetLocakDiskIDs() - Follow up on the same changes for Lockers as well	2020-06-17 14:49:26 -07:00
Harshavardhana	ab7d3cd508	fix: Speed up multi-object delete by taking bulk locks (#8974 ) Change distributed locking to allow taking bulk locks across objects, reduces usually 1000 calls to 1. Also allows for situations where multiple clients sends delete requests to objects with following names ``` {1,2,3,4,5} ``` ``` {5,4,3,2,1} ``` will block and ensure that we do not fail the request on each other.	2020-02-21 11:29:57 +05:30
Harshavardhana	720442b1a2	Add lock expiry handler to expire state locks (#8562 )	2019-11-25 16:39:43 -08:00
Harshavardhana	e9b2bf00ad	Support MinIO to be deployed on more than 32 nodes (#8492 ) This PR implements locking from a global entity into a more localized set level entity, allowing for locks to be held only on the resources which are writing to a collection of disks rather than a global level. In this process this PR also removes the top-level limit of 32 nodes to an unlimited number of nodes. This is a precursor change before bring in bucket expansion.	2019-11-13 12:17:45 -08:00
Harshavardhana	4e63e0e372	Return appropriate errors API versions changes across REST APIs (#8480 ) This PR adds code to appropriately handle versioning issues that come up quite constantly across our API changes. Currently we were also routing our requests wrong which sort of made it harder to write a consistent error handling code to appropriately reject or honor requests. This PR potentially fixes issues - old mc is used against new minio release which is incompatible returns an appropriate for client action. - any older servers talking to each other, report appropriate error - incompatible peer servers should report error and reject the calls with appropriate error	2019-11-04 09:30:59 -08:00
Krishna Srinivas	2ab0681c0c	Do not ignore Lock()'s return value (#8142 )	2019-08-28 16:12:57 -07:00
Harshavardhana	b52b90412b	Avoid data-transfer in distributed locking (#8004 )	2019-08-05 11:45:30 -07:00
Harshavardhana	59e1d94770	Remove stale entry spurious logging (#7663 ) The problem in current code was we were removing an entry from a lock lockerMap without considering the fact that different entry for same resource is a possibility due the nature of locks that can be acquired in parallel before we decide if the lock is considered stale A sequence of events is as follows - Lock("resource") - lockMaintenance(finds a long lived lock in this "resource") - Owner node rebooted which now retruns Expired() as true for this "resource" - Unlock("resource") which succeeded in quorum - Now by this time application retried and acquired a new Lock() on the same "resource" - Now that we have Expired() true from the previous call, we proceed to purge the entry from the local lockMap() local lockMap reports a different entry for the expired UID which results in a spurious log entry. This PR removes this logging as this situation is an expected scenario.	2019-05-22 12:21:36 -07:00
kannappanr	d2f42d830f	Lock: Use REST API instead of RPC (#7469 ) In distributed mode, use REST API to acquire and manage locks instead of RPC. RPC has been completely removed from MinIO source. Since we are moving from RPC to REST, we cannot use rolling upgrades as the nodes that have not yet been upgraded cannot talk to the ones that have been upgraded. We expect all minio processes on all nodes to be stopped and then the upgrade process to be completed. Also force http1.1 for inter-node communication	2019-04-17 23:16:27 -07:00

17 Commits