Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feature request: Neptune transaction support #10823

Closed
1 task done
tsanton opened this issue May 15, 2024 · 17 comments
Closed
1 task done

feature request: Neptune transaction support #10823

tsanton opened this issue May 15, 2024 · 17 comments
Assignees
Labels
aws:neptune Amazon Neptune status: backlog Triaged but not yet being worked on type: feature New feature, or improvement to an existing feature

Comments

@tsanton
Copy link

tsanton commented May 15, 2024

Is there an existing issue for this?

  • I have searched the existing issues

Feature description

Basic transactions were supported in Tinkerpop 3.7.0 by introducing a ´TinkerTransactionGraph´ class. The config for it can be found here.
I'm hoping it's possible for Localstack Neptune to support transactions.

🧑‍💻 Implementation

I know AWS support Gremlin 3.6.2 <= x <= 3.6.5 at the moment, and that Localstack is running a Gremlin Server behind the scenes. In terms of implementation I'm wondering if it's possible to implement the server config that's linked above by copying the 3.7.X jar and pointing to that transaction class? It would be extremely nice to be able to test locally with transactions, even when limited to single thread processing.

Anything else?

No response

@tsanton tsanton added status: triage needed Requires evaluation by maintainers type: feature New feature, or improvement to an existing feature labels May 15, 2024
@localstack-bot
Copy link
Collaborator

Welcome to LocalStack! Thanks for reporting your first issue and our team will be working towards fixing the issue for you or reach out for more background information. We recommend joining our Slack Community for real-time help and drop a message to LocalStack Pro Support if you are a Pro user! If you are willing to contribute towards fixing this issue, please have a look at our contributing guidelines and our contributing guide.

@tsanton tsanton changed the title feature request: Neptune support transactions feature request: Neptune transaction support May 15, 2024
@Anze1508 Anze1508 added aws:neptune Amazon Neptune status: backlog Triaged but not yet being worked on and removed status: triage needed Requires evaluation by maintainers labels May 15, 2024
@cloutierMat cloutierMat self-assigned this May 15, 2024
@cloutierMat
Copy link
Contributor

Hi @tsanton,

I have looked into the issue. While there appears to have been more changes to the overall gremlin server that makes the simple lifting of a couple .jar files more complicated than it seems, there might be another solution that could work in the interim.

We could create a configuration variable like NEPTUNE_ENABLE_TRANSACTION=1 that would essentially make localstack disregard the Neptune engine version provided at creation and use 3.7 instead. This would have the added benefit of also adding support for property(set, "prop1", "prop2").

Looking at the release notes, I am not seeing any major deprecation going from 3.6 to 3.7. The other major difference is listed under Properties on Elements and seems to have a configurable work around.

Let me know if you see any other complications I am missing.

@tsanton
Copy link
Author

tsanton commented May 16, 2024

Hi @cloutierMat and thank you for your expedient reply!

Personally I would love that solution. Though I'm not a Graph expert, I'd say I'm a layman and an enthusiast. With those restrictions in mind I can think of one potential issue: though deprecations are not a major issue, improvements on existing functionality from 3.6.x to 3.7.x can present behaviour problems (specifically when it comes to MergeV and MergeE). This GitHub issue explains one potential pitfall in depth. In short it's related to the option of specifying property cardinality during merge (can do in 3.7, not possible in 3.6).

If we implement the solution you suggest I would imagine the Localstack MergeE/MergeV behaviour with property cardinality would be 👍, whereas if you fired this at AWS Neptune you'd get a solid 👎. In my view I much prefer the (single thread) transaction support with a LocalStack note "FYI: Do not...." over the non-transactional server as is. Further I'm willing to bet a beer and a nacho (if you're ever somewhere in mid to north Europe) that Neptune 1.3.2.0 (due in x-3? months) will support >=3.7.x so this sort of ameliorates the issue. (Even)Further AWS Neptune docs clearly state: "[test your upgrades on real life instances before you roll out your new bundle of joy]" which gives us some leeway when implementing this behaviour change.

Thanks for you time and support @cloutierMat!

/T

@cloutierMat
Copy link
Contributor

Thanks for pointing that out! I will keep an eye for it, and document my findings in the docs.

@cloutierMat
Copy link
Contributor

Hi @tsanton,

The feature has now been implemented and you can pull the latest docker image to start using it. Information on how to enable it is in our Documentation.

When creating your Neptune instance you should see the following message in the logs indicating that transactions are enabled.

localstack-samples  | 2024-05-30T20:53:35.809  WARN --- [-functhread4] l.s.neptune.packages       : NEPTUNE_ENABLE_TRANSACTION flag is set. Ignoring 'engine-version' for version '3.4.11' and installing: '3.7.2'

As far as the MergeV/MergeE behaviour, from the few samples I have tried, 3.6.2 queries seem to be working as expected. Let us know how it goes.

@tsanton
Copy link
Author

tsanton commented May 31, 2024

Hi @cloutierMat, and thanks for the swift implementation!

I'll get on to refactoring my fixture over the weekend/early next week. I'll get back to you with the results when that is done.

Have a great weekend when that time comes :)

/T

@tsanton
Copy link
Author

tsanton commented Jun 6, 2024

Hi again @cloutierMat!

I've now had the time to look over it.
In short I have 328 tests of whom 326 came out green after swapping out the container and enabling transactions.
From these two failing tests (of whom one should fail) I've found one potential issue and one behaviour I'm not certain about.

Firstly the potential issue.

I have two verticies (V1 and V2) joined by an edge (LinkedE): V1 -> LinkedE -> V2.

This LinkedE has one property: Active (T/F). This property is being updated depending on, you guessed it: the relationship being active or not. The issue here, to me, seems to be the default cardinality. Now when I run the update test I get a failing test and I query out this result (edge properties):

{
          "key": "Active",
          "value": true,
          "metaPropertyOf": null,
          "id": "74aca9b1-efd3-4a9f-b608-4da20ece9321",
          "cardinality": "list",
 },
{
          "key": "Active",
          "value": false,
          "metaPropertyOf": null,
          "id": "405cee5c-3efc-4f83-8384-e959e1a87064",
          "cardinality": "list",
},

Though this docs is for vertex cardinality, is states that only single or set is supported. I think it's safe to assume that it would be the same for edges?

Unfortuntely I'm only in dev so I don't have neptune instance up just yet so I can't verify this.

The next "odd" behaviour is related to "implicit" transactions.
I had a test fail that previously passed: MergeV then MergeE but otherV does not exist, and I wanted all to fail and rollback. This is indeed the behaviour I'm seeing now, but it was happening before I set SupportsTransaction() to return true.

var g = await traversalFactory.GetClient(tenantId);
var tx = g.Tx();
if (traversalFactory.SupportsTransaction()) g = tx.Begin();
........
if (hasTx && tx is not null)
{
    await query.Promise(traversal => traversal.Iterate(), ct);
    await tx.CommitAsync(ct);
}
else
{
    await query.Promise(traversal => traversal.Iterate(), ct);
}

With other words it seems that an entire batch of steps is being given the transactional treatment (i.e. all or nothing). I'm then guessing that the explisit transaction usage is for multiple iterates? Further: is this in accordance with the behaviour of Neptune?

@cloutierMat
Copy link
Contributor

Hi @tsanton,

Awesome that most of your tests are passing now 🔥 🚀

I am a little confused by the errors you are seeing and I would love for you to give me more details on your setup and sample queries to reproduce. But here are my thoughts for the 2 points you bring up.

Edge Property Cardinality

I am not sure how you could get a list Cardinality on an Edge property as it doesn't seem to be supported by the Apache Gremlin language or by AWS Neptune.

See Apache Docs. Point two states For vertices, a cardinality can be provided for vertex properties.. Both implementations seem to only support Single and no metaproperties can be added.

Please provide a sample code as to how you are achieving this! 😄

Implicit Transaction

Let me know if I understand you correctly. you are performing something like.

t = g.add_v("person").as_("p")
t.add_e("knows").from_("p").to("invalidId")
result = t.iterate()

The expected behaviour is that the user will not be created as each traversal is creating a transaction where all of the individual steps must resolve. A traversal will involve all the steps before a terminal step.

So transaction are only making a difference when multiple terminal steps have occurred in the transaction. See last example on this page.

@tsanton
Copy link
Author

tsanton commented Jun 6, 2024

In terms of the Implicit Transaction we're in agreement then: each traversal is implicitly creating a transaction and then it's all or nothing. This is the behaviour I want, but didn't have in <= 3.6.5. All good (just had to ask as I wasn't sure).

In terms of the second error I partially have to say sorry. I was way to quick to jump the gun (having debugged the issue for 30 minutes on top of being jetlagged).
So, you are right: the active property is a property that is set on the vertex, not on the edge. Been a little while since I last fiddled with this so, again.. I blaim jetlag.

To re-summarize the issue: in short this is what I'm doing:

g.mergeV(System.Collections.Generic.Dictionary`2[System.Object,System.Object])
	.option(onCreate, Gremlin.Net.Process.Traversal.GraphTraversal`2[System.Object,System.Object])
	.as('source')
.mergeV(System.Collections.Generic.Dictionary`2[System.Object,System.Object])
	.option(onCreate, System.Collections.Generic.Dictionary`2[System.Object,System.Object])
	.option(onMatch, System.Collections.Generic.Dictionary`2[System.Object,System.Object])
	.as('target')
.mergeE(System.Collections.Generic.Dictionary`2[System.Object,System.Object])
	.option(onCreate, System.Collections.Generic.Dictionary`2[System.Object,System.Object])
	.option(outV, Gremlin.Net.Process.Traversal.GraphTraversal`2[System.Object,System.Object])
	.option(inV, Gremlin.Net.Process.Traversal.GraphTraversal`2[System.Object,System.Object])

First MergeV is the root:

  • OnCreate: has a __.fail() step attached to it (throw if not exist)

Second MergeV has a

  • OnCreate: Insert MapOfAllElements (of whom single cardinality active is one of them)
  • OnMatch: Update MapOfSomeElements(of whom single cardinality active is the only property)

The finally it's the MergeE:

  • OnCreate: insert (implisit OnMatch continue)
  • From 'source'
  • To 'target'

In this case it's the second MergeV that "blows up" because (I think) the MergeV step assumes a default cardinality of list (and we're not able to specify cardinality in Merges < 3.7.x which Neptune does not yet support).

A fix here, I think, would be to set the gremlin.tinkergraph.defaultVertexPropertyCardinality to single (or set if possible) aligning it with Neptune who does not support "list"? Default vertex cardinality config ref. can be found here

Hope this is a bit more enlightening, but as I've only had half a night of sleep I do reserve the right to (though trying to avoid) making another blunder :)

/T

@cloutierMat
Copy link
Contributor

cloutierMat commented Jun 6, 2024

Glad to see transactions are working as expected.

As for the cardinality issue. I have pushed another fix last Friday that set the default Cardinality to Set, by default gremlin-server was Single. It is expected that on retrieval you would see list, as the list/set difference is only present when updating a property. It is a weird behaviour but consistent with Neptune.

And be aware, when using mergeV and updating a property in the option step, Neptune will always use it's default of Set, as you cant specify Cardinality. Which is what you are now observing. This change you are seeing is actually a fix to bring more parity with Neptune 😄

edit: the default behavior is set, which is what you are observing. The 3.6 syntax to specify Cardinality is still valid.

g.add_v("p").property("name", "john").next()
g.merge_v({T.label: "p", "name": "john"}).option(
    Merge.on_match,
    __.side_effect(__.property(Cardinality.single, "name", "john2")).constant({}),
).next()
# results in
{<T.id: 1>: 'b934895b-ffee-40c7-8d35-f55a982a6c90', <T.label: 4>: 'p', 'name': ['john2']}

@tsanton
Copy link
Author

tsanton commented Jun 7, 2024

I'm sorry @cloutierMat but this you're going to have to run by me one more time.
My impression is that Neptune does not support lists (i.e. all is set/single). Therefore I should expect all my updates to not to duplicate properties (i.e. Active = True & Active = False), but a single property that has been updated?

This is not the behaviour I'm seing at the moment -> I'm having exceptions thrown because the serialiser is getting a List when it expects a single bool.

To be clear: I'm not observing set (with [{"Active": <updated>}], I'm observing list ([{"active": true}, {"active": false] (which Neptune should not support?).

Further, my understand ing that in 3.6.x syntax you cannot specify cardinality -> this was first introduced in 3.7.x here (though a bit down -> look for CardinalityValue)

@cloutierMat
Copy link
Contributor

A set is a list with no repeated element. It only matters when adding property to an element.

When querying, your access pattern will determine if you get a string or a list. ie. .by() will always return a list, but can be of only one element. .by(unfold()) will always return a string even if there are multiple element for that property.

3.7 introduce CardinalityValue for use in this pattern {"foo": CardinalityValue.single("bar")}
3.6 cardinality has to be done property(Cardinality.single, "foo", "bar"). Which can't be done in an option step, unless you are using a side effect.

The below examples return the same value for ls-3.6, ls-transaction enabled and Neptune-1.3.0.0

vertex = g.addV("player").property("card", "ace")
g.merge_v({T.id: vertex.id})
.option(Merge.on_match, __.side_effect(property("card", "king")).constant({}))
.valueMap("card")
.next()
## "card": ["ace", "king"]

vertex = g.addV("player").property("card", "ace")
g.merge_v({T.id: vertex.id})
.option(Merge.on_match, __.side_effect(property(Cardinality.set, "card", "king")).constant({}))
.valueMap("card")
.next()
## "card": ["ace", "king"]

vertex = g.addV("player").property("card", "ace")
g.merge_v({T.id: vertex.id})
.option(Merge.on_match, __.side_effect(property(Cardinality.single, "card", "king")).constant({}))
.valueMap("card")
.next()
## "card": ["king"]

@tsanton
Copy link
Author

tsanton commented Jun 7, 2024

Okay, many thanks @cloutierMat :)

So, just to reiterate: when using the MergeV/E-pattern Neptune will assume default cardinality (when not specified, and specifications are not possible < 3.7) which is set.

So what I'm seeing now on my MergeV with update options:

This

{
    "id": "3a7937fa-8bb8-471c-8c04-b314f90c1ce2",
    "label": "UnitV",
    "properties": [
      {
        "key": "LastModified",
        "value": "Sat Jun 08 05:55:50 JST 2024",
        "metaPropertyOf": null,
        "id": "8a5b9849-c3df-4012-88cd-f9a113392d9f",
        "cardinality": "single",
      },
      {
        "key": "Active",
        "value": true,
        "metaPropertyOf": null,
        "id": "0a64bba4-a530-4197-963c-b90cd8df8115",
        "cardinality": "single",
      },
      {
        "key": "TenantId",
        "value": "f434182a-f6e3-41e2-9180-c017d8242352",
        "metaPropertyOf": null,
        "id": "41dfa80c-779e-4786-9c5e-ac03a8757d95",
        "cardinality": "single",
      }
    ]
  }

turning into this:

{
  "id": "3a7937fa-8bb8-471c-8c04-b314f90c1ce2",
  "label": "UnitV",
  "properties": [
    {
      "key": "LastModified",
      "value": "Sat Jun 08 05:55:50 JST 2024",
      "metaPropertyOf": null,
      "id": "8a5b9849-c3df-4012-88cd-f9a113392d9f",
      "cardinality": "list",
    },
    {
      "key": "LastModified",
      "value": "Sat Jun 08 06:00:21 JST 2024",
      "metaPropertyOf": null,
      "id": "c3e6f5f7-c865-4b66-bb01-e88c35f26d55",
      "cardinality": "list",
    },
    {
      "key": "Active",
      "value": true,
      "metaPropertyOf": null,
      "id": "0a64bba4-a530-4197-963c-b90cd8df8115",
      "cardinality": "list",
  	},
    {
      "key": "Active",
      "value": false,
      "metaPropertyOf": null,
      "id": "c3be8e5d-5a2b-470d-9db0-1981289cb590",
      "cardinality": "list",
    },
    {
      "key": "TenantId",
      "value": "f434182a-f6e3-41e2-9180-c017d8242352",
      "metaPropertyOf": null,
      "id": "41dfa80c-779e-4786-9c5e-ac03a8757d95",
      "cardinality": "single",
    },
  ]
}

is the expected behavour in Neptune?

If that the case then guess I'm going to have to rewrite my "merger" to run a OnUpdate SideEffect with property cardinality instead :)

Thanks for your great effort enhancing Localstack Neptune -> it's a major enhancement having a more on pair behaviour from the local emulator!

@tsanton
Copy link
Author

tsanton commented Jun 7, 2024

On a side note @cloutierMat, have you guys heard anything about (or have an internal estimate) for when 3.7.x will be supported by AWS? I will be darned: 1.3.2.0 released at 2024-06-06!

Christmas came early this year :)

@cloutierMat
Copy link
Contributor

cloutierMat commented Jun 7, 2024

Awesome I will be working on support for it hopefully next week. It should be fairly straight forward, but some patches will need updating as we currently provision 3.7.2 for transaction and the new Neptune engine supports up to 3.7.1.

As for your example, it seems to me like you are on the right path. I remember it being the major issue with MergeV which was meant to be a fix for the coalesce workaround to perform upsert, but itself needed a workaround with sideEffect to work properly 😁

On a side note, the cardinality isn't normally part of a gremlin valueMap, is it a side effect of your internal serializer?

@tsanton
Copy link
Author

tsanton commented Jun 8, 2024

I just spent the last hours refactoring the solution and I just made 328/328 green ones with the usage of Cardinality:

if (value.GetType().IsGenericType && value.GetType().GetGenericTypeDefinition() == typeof(List<>))
{
    vertexDetails.Add(property.Name, CardinalityValue.Set(value));
}
else
{
    vertexDetails.Add(property.Name, CardinalityValue.Single(value));
}

There's also a little tweak that must be done if it's a collection (i.e. set above): As I'm sure you know you can only add the first value in the merge and then must add additional values by the .property logic:

properties.Where(p => p.Value is CardinalityValue val && Cardinality.Set.Equals(val.Cardinality))
                    .SelectMany(p => (((CardinalityValue)p.Value).Value as IList ?? new List<object>()).Cast<object>()
                        .Select(value => new { p.Key, Value = value }))
                    .ToList()
                    .ForEach(set => query.Property(Cardinality.Set, set.Key, set.Value));

It's not beautiful, but it works as a charm!

In terms of you side note, I'm on northern Europes thinest ice trying to answer this so.. Take it with a bucket of salt.
I do believe the dotnet serializer to be ignorant of cardinality as I have to do the "manual" conversion for collections that you see above. I know the deserialiser to be oblivious of everything as it simply returns me an IList<Dict<string,object>?>? that I run through a json serialize&deserialize in order to convert it into defined classes.

Hope that answers your question :)

All in all I'm a happy camper and I guess this issue can be closed!

/T

@cloutierMat
Copy link
Contributor

Awesome, I am glad all is working good for you, and thank you for the extra insight. I was wondering where that cardinality property was coming from, it makes more sense now that it comes from your serializer.

I will close this issue and don't hesitate to open a new one should you find more parity issue with Neptune!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
aws:neptune Amazon Neptune status: backlog Triaged but not yet being worked on type: feature New feature, or improvement to an existing feature
Projects
None yet
Development

No branches or pull requests

4 participants